Abstract. For a function $f : X \to \mathbb{R}$, a point is critical if its derivatives are
zero, and a critical point is a saddle point if it is not a local extremum. In this
paper, we study algorithms to find saddle points of general Morse index. Our
approach is motivated by the multidimensional mountain pass theorem, and
extends our earlier work on methods (based on studying the level sets of $f$)
to find saddle points of mountain pass type. We prove the convergence of our
algorithms in the nonsmooth case, and the local superlinear convergence of
another algorithm in the smooth finite dimensional case.



1. Introduction

For a function $f : X \to \mathbb{R}$, we say that $x$ is a critical point if $\nabla f(x) = 0$, and
$y$ is a critical value if there is some critical point $x$ such that $f(x) = y$. A critical
point $x$ is a saddle point if it is neither a local minimizer nor a local maximizer.
In this paper, we present algorithms based on the multidimensional mountain pass
theorem to find saddle points numerically.

The main purpose of critical point theory is the study of variational problems.
These are problems (P) such that there exists a smooth functional $\Phi : X \to \mathbb{R}$
whose critical points are solutions of (P). Variational problems occur frequently in
the study of partial differential equations.

At this point, we make a remark about saddle points in the study of min-max
problems. Such saddle points occur in problems in game theory and in constrained
optimization using the Lagrangian, and have the splitting structure

$$\min_{x \in X} \max_{y \in Y} f(x, y).$$

In min-max problems, this splitting structure is exploited in numerical procedures.
See [15] for a survey of algorithms for min-max problems. In the general case, for
example in finding weak solutions of partial differential equations, such a splitting
structure may only be obtained after the saddle point is located, and thus is not
helpful for finding the saddle point.

A critical point $x$ is nondegenerate if its Hessian $\nabla^2 f(x)$ is nonsingular, and it is
degenerate otherwise. The Morse index of a critical point is the maximal dimension
of a subspace of $X$ on which the Hessian $\nabla^2 f(x)$ is negative definite. In the finite
dimensional case, the Morse index is the number of negative eigenvalues of the
Hessian.
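
As a small numerical illustration of the finite dimensional case, the sketch below counts the negative eigenvalues of a symmetric Hessian with NumPy. The helper name `morse_index`, the tolerance `tol`, and the test function are assumptions made for this illustration and do not come from the paper.

```python
import numpy as np

def morse_index(hessian, tol=1e-10):
    """Count negative eigenvalues of a symmetric Hessian (hypothetical helper)."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    # A (numerically) zero eigenvalue means the critical point is degenerate.
    if np.any(np.abs(eigenvalues) <= tol):
        raise ValueError("Hessian is numerically singular: degenerate critical point")
    return int(np.sum(eigenvalues < 0))

# f(x, y, z) = x^2 - y^2 - z^2 has a nondegenerate critical point at the origin.
hessian_at_origin = np.diag([2.0, -2.0, -2.0])
index = morse_index(hessian_at_origin)
```

Here `index` counts the two negative diagonal entries, so the origin is a saddle point of Morse index 2 for this test function.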

Local maximizers and minimizers of $f : X \to \mathbb{R}$ are easily found using optimization,
while saddle points are harder to find. To find saddle points of Morse index
1, one can use algorithms motivated by the mountain pass theorem. Given points
$a, b \in X$, define a mountain pass $p^* \in \Gamma(a, b)$ to be a minimizer of the problem

$$\inf_{p \in \Gamma(a, b)} \ \sup_{0 \leq t \leq 1} f(p(t)),$$

if it exists. Here, $\Gamma(a, b)$ is the set of continuous paths $p : [0, 1] \to X$ such that $p(0) = a$
and $p(1) = b$. Ambrosetti and Rabinowitz's [1] mountain pass theorem states that
under added conditions, there is a critical value of at least $\max\{f(a), f(b)\}$. To find
saddle points of higher Morse index, it is instructive to look at theorems establishing
the existence of critical points of Morse index higher than 1. Rabinowitz proved
the multidimensional mountain pass theorem which in turn motivated the study of
linking methods to find saddle points. We shall recall theoretical material relevant
for finding saddle points of higher Morse index in this paper as needed.

While the study of numerical methods for the mountain pass problem began
in the 1970s or earlier to study problems in computational chemistry, Choi and
McKenna [3] were the first to propose a numerical method for the mountain pass
problem to solve variational problems. Most numerical methods for finding critical
points of mountain pass type rely on discretizing paths in $\Gamma(a, b)$ and perturbing
paths to lower the maximum value of $f$ on the path. There are a few other methods
of finding saddle points of mountain pass type that do not involve perturbing paths,
for example jHlll]. 

Saddle points of higher Morse index are obtained with modifications of the mountain
pass algorithm. Ding, Costa and Chen [6] proposed a numerical method for
finding critical points of Morse index 2, and Li and Zhou [13] proposed a method
for finding critical points of higher Morse index.

In [12], we suggested a numerical method for finding saddle points of mountain
pass type. The key observation is that the value

$$\sup \{\, l \geq \max(f(a), f(b)) \mid a, b \text{ lie in different path components of } \{x \mid f(x) \leq l\} \,\}$$

is a critical value. In other words, the supremum of all levels $l$ such that there is no
path connecting $a$ and $b$ in the level set $\{x \mid f(x) \leq l\}$ is a critical value. See Figure
1.1 for an illustration of the difference between the two approaches. An extensive
theoretical analysis and some numerical results of this approach were provided in
[12].
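
The key observation can be tried out on a grid. The following is a minimal sketch, not the method of [12]: it samples a test function on a rectangular grid, checks by breadth-first search whether two grid points are connected inside a sublevel set, and bisects on the level. The test function, the grid, the 4-neighbour connectivity rule, and the names `connected_in_sublevel` and `separating_level` are all assumptions made for illustration.

```python
from collections import deque

def connected_in_sublevel(values, level, start, goal):
    """Breadth-first search over grid cells whose value is <= level."""
    rows, cols = len(values), len(values[0])
    if values[start[0]][start[1]] > level or values[goal[0]][goal[1]] > level:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        i, j = queue.popleft()
        if (i, j) == goal:
            return True
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if (0 <= ni < rows and 0 <= nj < cols and (ni, nj) not in seen
                    and values[ni][nj] <= level):
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False

def separating_level(values, start, goal, tol=1e-8):
    """Bisect for the supremum of levels at which start and goal are disconnected."""
    lower = max(values[start[0]][start[1]], values[goal[0]][goal[1]])
    upper = max(max(row) for row in values)
    while upper - lower > tol:
        level = 0.5 * (lower + upper)
        if connected_in_sublevel(values, level, start, goal):
            upper = level   # a path exists inside the sublevel set
        else:
            lower = level   # the two points are still separated
    return 0.5 * (lower + upper)

# Assumed test function: f(x, y) = (x^2 - 1)^2 + y^2, with minimizers (-1, 0) and (1, 0)
# and a saddle point of mountain pass type at the origin with critical value 1.
xs = [-2.0 + 0.05 * i for i in range(81)]
ys = [-1.0 + 0.05 * j for j in range(41)]
values = [[(x * x - 1.0) ** 2 + y * y for y in ys] for x in xs]
a, b = (20, 20), (60, 20)   # grid indices of (-1, 0) and (1, 0)
level = separating_level(values, a, b)
```

For this test function the bisection settles near the critical value 1 of the saddle point at the origin. On a general grid the answer is only accurate up to the grid resolution.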

In this paper, we extend three of the themes in the level set approach to find
saddle points of higher Morse index, namely the convergence of the basic algorithm
(Sections 2 and 3), the optimality condition of the sub-problem (Section 4), and a fast
locally convergent method in $\mathbb{R}^n$ (Sections 6 and 7). Section 5 presents an alternative
result on convergence to a critical point similar to that of Section 3.

We refer the reader to [12] for examples reflecting the limitations of the level set
approach for finding saddle points of mountain pass type, which will be relevant
for the design of level set methods of finding saddle points of general Morse index.




Figure 1.1. The diagram on the left shows the classical method of 
perturbing paths for the mountain pass problem, while the diagram 
on the right shows convergence to the critical point by looking at 
level sets. 



Notation

$\mathrm{lev}_{\geq b} f$: This is the level set $\{x \mid f(x) \geq b\}$, where $f : X \to \mathbb{R}$. The
interpretations of $\mathrm{lev}_{\leq b} f$ and $\mathrm{lev}_{= b} f$ are similar.

$\mathbb{B}$: The ball of center $0$ and radius $1$. $\mathbb{B}(x, r)$ stands for a ball of center $x$ and
radius $r$. $\mathbb{B}^n$ denotes the $n$-dimensional ball in $\mathbb{R}^n$.

$\mathbb{S}^n$: The $n$-dimensional sphere in $\mathbb{R}^{n+1}$.

$\partial$: Subdifferential of a real-valued function, or the relative boundary of a set.
If $h : \mathbb{B}^n \to S$ is a homeomorphism between $\mathbb{B}^n$ and $S$, then the relative
boundary of $S$ is $h(\mathbb{S}^{n-1})$.

$\mathrm{lin}(A)$: For an affine space $A$, the lineality space $\mathrm{lin}(A)$ is the space
$\{a - a' \mid a, a' \in A\}$.

2. Algorithm for critical points 

We look at the critical point existence theorems to give an insight on our algorithm
for finding critical points of higher Morse index below. Here is the definition
of linking sets. We take our definition from [17, Section II.8].

Definition 2.1. (Linking) Let $A$ be a subset of $\mathbb{R}^n$, $B$ a submanifold of $\mathbb{R}^n$ with
relative boundary $\partial B$. Then we say that $A$ and $\partial B$ link if

(a) $A \cap \partial B = \emptyset$, and

(b) for any continuous $h : \mathbb{R}^n \to \mathbb{R}^n$ such that $h|_{\partial B} = \mathrm{id}$ we have $h(B) \cap A \neq \emptyset$.

Figure 2.1. Linking subsets

Figure 2.1 illustrates two examples of linking subsets in $\mathbb{R}^3$. In the diagram on
the left, the set $A$ is the union of two points inside and outside the sphere $B$. In the
diagram on the right, the sets $A$ and $B$ are the interlocking 'rings'. Note however
that $A$ and $B$ link does not imply that $B$ and $A$ link, though this will be true with
additional conditions. We hope this does not cause confusion.

We now recall the Palais-Smale condition. 

Definition 2.2. (Palais-Smale condition) Let $X$ be a Banach space and $f : X \to \mathbb{R}$
be $C^1$. We say that a sequence $\{x_i\}_{i=1}^\infty \subset X$ is a Palais-Smale sequence if $\{f(x_i)\}_{i=1}^\infty$
is bounded and $\nabla f(x_i) \to 0$, and $f$ satisfies the Palais-Smale condition if any
Palais-Smale sequence admits a convergent subsequence.

The classical multidimensional mountain pass theorem originally due to Rabinowitz [2]
states that under added conditions, if there are linking sets $A$ and $B$ such that
$\max_A f < \min_B f$ and the Palais-Smale condition holds, then there is a critical
value of at least $\max_A f$ for the case when $f$ is smooth. (See Theorem 6.1 for a
statement of the multidimensional mountain pass theorem.) Generalizations in the
nonsmooth case are also well-known in the literature. See for example [8].

To find saddle points of Morse index $m$, we consider finding a sequence of linking
sets $\{A_i\}_{i=1}^\infty$ and $\{B_i\}_{i=1}^\infty$ such that $\mathrm{diam}(A_i)$, the diameter of the set $A_i$, decreases
to zero, and the set $A_i$ is a subset of an $m$-dimensional affine space. This motivates
the following algorithm.

Algorithm 2.3. First algorithm for finding saddle points of Morse index $m \geq 1$.

(1) Set the iteration count $i$ to $0$, and let $l_i$ be a lower bound of the critical
value and $u_i$ be an upper bound.

(2) Find $x_i$ and $y_i$, where $(S_i, x_i, y_i)$ is an optimizing triple of

$$\min_{S \in \mathcal{S}} \ \max_{x, y \in S \cap (\mathrm{lev}_{\geq \frac{1}{2}(l_i + u_i)} f) \cap U_i} |x - y|, \tag{2.1}$$

where $U_i$ is some open set. Here, $\mathcal{S}$ is the set of $m$-dimensional affine
subspaces of $\mathbb{R}^n$ intersecting $U_i$. In the inner maximum problem above,
we take the value to be $0$ if $S \cap (\mathrm{lev}_{\geq \frac{1}{2}(l_i + u_i)} f) \cap U_i$ is empty, making the
objective function above equal to $0$. For simplicity, we shall just assume
that minimizers and maximizers of the above problem exist.



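(3) (Bisection) If the objective of (2.1) is positive, then $\frac{1}{2}(l_i + u_i)$ is a lower bound
of the critical value. Set $l_{i+1} = \frac{1}{2}(l_i + u_i)$ and $u_{i+1} = u_i$. Otherwise, the objective
is zero and $\frac{1}{2}(l_i + u_i)$ is an upper bound, so set $l_{i+1} = l_i$ and $u_{i+1} = \frac{1}{2}(l_i + u_i)$.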

(4) Increase $i$ and go back to step 2.

The critical step of Algorithm 2.3 lies in step 2. We elaborate on optimality
conditions that will be a useful approximation for this step in Section 4. One may think
of the set $A_i$ as the relative boundary (relative to the affine space $S_i$) of $S_i \cap (\mathrm{lev}_{\geq l_i} f) \cap U_i$.
A frequent assumption we will make is nondegeneracy.

Definition 2.4. We say that a critical point is nondegenerate if its Hessian is
invertible.

Algorithm 2.3 requires $m > 0$, but when $m = 0$, nondegenerate critical points
of Morse index zero are just strict local minimizers that can be easily found by
optimization. We illustrate two special cases of Algorithm 2.3.



Example 2.5. (Particular cases of Algorithm 2.3) (a) For the case $m = 1$, $\mathcal{S}$ is
the set of lines. The inner maximization problem in (2.1) has its solution on the
two endpoints of $S_i \cap (\mathrm{lev}_{\geq \frac{1}{2}(l_i + u_i)} f) \cap U_i$. This means that (2.1) is equivalent to
finding the local closest points between two components of $(\mathrm{lev}_{\leq \frac{1}{2}(l_i + u_i)} f) \cap U_i$, as
was analyzed in [12].

(b) For the case $m = n$, $\mathcal{S}$ contains the whole of $\mathbb{R}^n$. Hence the outer minimization
problem in (2.1) is superfluous. The level set $(\mathrm{lev}_{\geq \frac{1}{2}(l_i + u_i)} f) \cap U_i$ gets smaller
and smaller as $\frac{1}{2}(l_i + u_i)$ approaches the maximum value, till it becomes a single
point if the maximizer is unique.
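
Case (b) is simple enough to sketch numerically. The code below is a rough illustration under stated assumptions, not the paper's implementation: the set $U_i$ is replaced by a finite grid of sample points, the diameter of the superlevel set is computed by brute force over the samples, and the level is bisected. When the sampled superlevel set is empty or a single point, the level is treated as lying above the local maximum value. The names `superlevel_diameter` and `bisect_local_max_value` and the test function are hypothetical.

```python
import math

def superlevel_diameter(points, values, level):
    """Diameter of the sampled set {p : f(p) >= level}; 0.0 if it has fewer than two points."""
    selected = [p for p, v in zip(points, values) if v >= level]
    return max((math.dist(p, q) for p in selected for q in selected), default=0.0)

def bisect_local_max_value(f, points, lower, upper, tol=1e-6):
    """Bisect on the level until the bracket [lower, upper] is shorter than tol."""
    values = [f(p) for p in points]
    while upper - lower > tol:
        level = 0.5 * (lower + upper)
        if superlevel_diameter(points, values, level) == 0.0:
            upper = level   # superlevel set is empty or a single sample
        else:
            lower = level   # superlevel set still has positive diameter
    return lower, upper

# Assumed test function with a unique maximizer at the origin and maximum value 1.
def f(p):
    return 1.0 - p[0] ** 2 - p[1] ** 2

grid = [(-1.0 + 0.1 * i, -1.0 + 0.1 * j) for i in range(21) for j in range(21)]
lower, upper = bisect_local_max_value(f, grid, lower=-1.0, upper=2.0)
```

Because only grid samples are used, the bracket closes on the second-largest sampled value (about 0.99 here) rather than on the exact maximum 1. Refining the grid shrinks this gap.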



3. Convergence properties 

In this section, we prove the convergence of $x_i$, $y_i$ in Algorithm 2.3 to a critical
point when they converge to a common limit. We recall some facts about nonsmooth
analysis needed for the rest of the paper. It is more economical to prove our result
for nonsmooth critical points because the proofs are not that much harder, and
nonsmooth critical points are also of interest in applications.

Let $X$ be a Banach space, and $f : X \to \mathbb{R}$ be a locally Lipschitz function at a
given point $x$.

Definition 3.1. (Clarke subdifferential) [5, Section 2.1] Suppose $f : X \to \mathbb{R}$ is
locally Lipschitz at $x$. The Clarke generalized directional derivative of $f$ at $x$ in the
direction $v \in X$ is defined by

$$f^\circ(x; v) = \limsup_{t \downarrow 0,\ y \to x} \frac{f(y + tv) - f(y)}{t},$$

where $y \in X$ and $t$ is a positive scalar. The Clarke subdifferential of $f$ at $x$, denoted
by $\partial_C f(x)$, is the subset of the dual space $X^*$ given by

$$\{\zeta \in X^* \mid f^\circ(x; v) \geq \langle \zeta, v \rangle \text{ for all } v \in X\}.$$

The point $x$ is a Clarke (nonsmooth) critical point if $0 \in \partial_C f(x)$. Here, $\langle \cdot, \cdot \rangle :
X^* \times X \to \mathbb{R}$ defined by $\langle \zeta, v \rangle := \zeta(v)$ is the dual relation.

For the particular case of $C^1$ functions, $\partial_C f(x) = \{\nabla f(x)\}$. Therefore critical
points of smooth functions are also nonsmooth critical points. From the definitions
above, it is clear that an equivalent definition of a nonsmooth critical point is
$f^\circ(x; v) \geq 0$ for all $v \in X$. This property allows us to prove that a point is
nonsmooth critical without appealing to the dual space $X^*$.
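
To make the criterion $f^\circ(x; v) \geq 0$ concrete, here is a crude one-dimensional estimate of the Clarke directional derivative: it takes the largest difference quotient over base points $y$ sampled near $x$, with one small fixed step $t$. The function name and the parameters `radius`, `step` and `samples` are assumptions chosen for this sketch. A finite sample can only approximate the limsup from below.

```python
def clarke_directional_derivative(f, x, v, radius=1e-3, step=1e-6, samples=2001):
    """Crude estimate of f°(x; v) for f: R -> R via the largest difference quotient near x."""
    best = float("-inf")
    for k in range(samples):
        y = x - radius + 2.0 * radius * k / (samples - 1)   # base point near x
        best = max(best, (f(y + step * v) - f(y)) / step)
    return best

# |x| has a local minimizer at 0, while -|x| has a local maximizer at 0.
up = clarke_directional_derivative(abs, 0.0, 1.0)
down = clarke_directional_derivative(abs, 0.0, -1.0)
neg_up = clarke_directional_derivative(lambda s: -abs(s), 0.0, 1.0)
neg_down = clarke_directional_derivative(lambda s: -abs(s), 0.0, -1.0)
```

All four estimates are close to 1, hence nonnegative, which is consistent with 0 being a Clarke critical point of both $|x|$ and $-|x|$.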

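We now prove our result of convergence to nonsmooth critical points.

Proposition 3.2. (Convergence to saddle point) Let $z \in X$. Suppose there is a
ball $\mathbb{B}(z, r)$, a sequence of triples $\{(S_i, x_i, y_i)\}_{i=1}^\infty$ and a sequence $l_i$ monotonically
increasing to $f(z)$ such that $(x_i, y_i) \to (z, z)$ and $(S_i, x_i, y_i)$ is an optimizing triple
of (2.1) in Algorithm 2.3 for $l_i$ with $U_i = \mathbb{B}(z, r)$. Then $z$ is a Clarke critical point.

Proof. Seeking a contradiction, suppose there exists some direction $v$ such that
$f^\circ(z; v) < 0$. This means that there is some $\bar\epsilon > 0$ such that if $|\bar z - z| < \bar\epsilon$ and $\epsilon < \bar\epsilon$,
then

$$\frac{f(\bar z + \epsilon v) - f(\bar z)}{\epsilon} < \frac{1}{2} f^\circ(z; v)
\quad \Longrightarrow \quad f(\bar z + \epsilon v) < f(\bar z) + \frac{1}{2} \epsilon f^\circ(z; v).$$

Suppose $i$ is large enough so that $x_i, y_i \in \mathbb{B}(z, \frac{\bar\epsilon}{2})$, and that $x_i, y_i \in A_i := S_i \cap
(\mathrm{lev}_{\geq l_i} f) \cap \mathbb{B}(z, r)$ are such that $|x_i - y_i| = \mathrm{diam}(A_i)$. Consider the set $\tilde A_i :=
(S_i + \epsilon_1 v) \cap (\mathrm{lev}_{\geq l_i} f) \cap \mathbb{B}(z, r)$, where $\epsilon_1 > 0$ is arbitrarily small. Let $\tilde x_i, \tilde y_i \in \tilde A_i$ be
such that $|\tilde x_i - \tilde y_i| = \mathrm{diam}(\tilde A_i)$. From the minimality of the outer minimization, we
have $|\tilde x_i - \tilde y_i| \geq |x_i - y_i|$. Note that $f(\tilde x_i) = f(\tilde y_i) = l_i$. Then

$$f(\tilde x_i) < f(\tilde x_i - \epsilon_1 v) + \frac{1}{2} \epsilon_1 f^\circ(z; v)
\quad \Longrightarrow \quad f(\tilde x_i - \epsilon_1 v) > f(\tilde x_i) - \frac{1}{2} \epsilon_1 f^\circ(z; v) > l_i.$$

The continuity of $f$ implies that we can find some $\epsilon_2 > 0$ such that $\hat x_i := \tilde x_i - \epsilon_1 v +
\epsilon_2 (\tilde x_i - \tilde y_i)$ lies in $A_i$. Similarly, $\hat y_i := \tilde y_i - \epsilon_1 v$ lies in $A_i$ as well. But

$$|\hat x_i - \hat y_i| > |\tilde x_i - \tilde y_i| \geq |x_i - y_i|.$$

This contradicts the maximality of $|x_i - y_i|$ in $A_i$, and thus $z$ must be a critical
point. □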



4. Optimality conditions

We now reduce the min-max problem (2.1) to a condition on the gradients $\nabla f(x_i)$
and $\nabla f(y_i)$ that is easy to verify numerically. This condition will help in the
numerical solution of (2.1). We use methods in sensitivity analysis of optimization
problems (as is done in [3]) to study how varying the $m$-dimensional affine space $S$ in
an $(m+1)$-dimensional subspace affects the optimal value in the inner maximization
problem in (2.1). We conform as much as possible to the notation in [3] throughout
this section.

Consider the following parametric optimization problem $(P_u)$ in terms of $u \in \mathbb{R}$
as an $m + 1$ dimensional model in $\mathbb{R}^{m+1}$ of the inner maximization problem in (2.1):

$$(P_u): \quad \begin{array}{rl} v(u) := \min & F(x, y, u) := -|x - y|^2 \\ \text{s.t.} & G(x, y, u) \in K, \\ & x, y \in \mathbb{R}^{m+1}, \end{array} \tag{4.1}$$

where $G : (\mathbb{R}^{m+1})^2 \times \mathbb{R} \to \mathbb{R}^4$ and $K \subset \mathbb{R}^4$ are defined by

$$G(x, y, u) := \begin{pmatrix} -f(x) + b \\ -f(y) + b \\ (0, 0, \dots, 0, u, 1)\, x \\ (0, 0, \dots, 0, u, 1)\, y \end{pmatrix}, \qquad K := \mathbb{R}_-^2 \times \{0\}^2.$$

The problem $(P_u)$ reflects the inner maximization problem of (2.1). Due to the
standard practice of writing optimization problems as minimization problems, (4.1)
is a minimization problem instead. We hope this does not cause confusion.

Let $S(u)$ be the $m$-dimensional subspace orthogonal to $(0, \dots, 0, u, 1)$. The first
two components of $G(x, y, u)$ model the constraints $f(x) \geq b$ and $f(y) \geq b$, while
the last two components enforce $x, y \in S(u)$. Denote an optimal solution to $(P_u)$
to be $(\bar x(u), \bar y(u))$, and let $(\bar x, \bar y) := (\bar x(0), \bar y(0))$. We make the following assumption
throughout.

Assumption 4.1. (Uniqueness of optimizers) $(P_0)$ has a unique solution $\bar x = 0$
and $\bar y = (0, \dots, 0, 1, 0)$ at $u = 0$.

We shall investigate how the set of minimizers of $(P_u)$ behaves with respect to
$u$ at $0$.

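The derivatives of $F$ and $G$ with respect to $x$ and $y$, denoted by $D_{x,y}F$ and
$D_{x,y}G$, are

$$D_{x,y}F(x, y, u) = 2 \begin{pmatrix} (y - x)^T & (x - y)^T \end{pmatrix},$$

$$\text{and} \quad D_{x,y}G(x, y, u) = \begin{pmatrix} -\nabla f(x)^T & \\ & -\nabla f(y)^T \\ (0, 0, \dots, 0, u, 1) & \\ & (0, 0, \dots, 0, u, 1) \end{pmatrix}, \tag{4.2}$$

where the blank terms in $D_{x,y}G(x, y, u)$ are all zero.

The Lagrangian is the function $L : (\mathbb{R}^{m+1})^2 \times \mathbb{R}^4 \times \mathbb{R} \to \mathbb{R}$ defined by

$$L(x, y, \lambda, u) := F(x, y, u) + \sum_{i=1}^{4} \lambda_i G_i(x, y, u).$$

We say that $\lambda := (\lambda_1, \lambda_2, \lambda_3, \lambda_4)$, depending on $u$, is a Lagrange multiplier if
$D_{x,y}L(x, y, \lambda, u) = 0$ and $\lambda \in N_K(G(x, y, u))$, and the set of all Lagrange multipliers
is denoted by $\Lambda(x, y, u)$. Here, $N_K(G(x, y, u))$ stands for the normal cone
defined by

$$N_K(G(x, y, u)) := \{v \in \mathbb{R}^4 \mid v^T [w - G(x, y, u)] \leq 0 \text{ for all } w \in K\}.$$

We are interested in the set $\Lambda(\bar x, \bar y, 0)$. It is clear that optimal solutions must satisfy
$G(\bar x, \bar y, 0) = 0$, so $\lambda \in N_K(0) = \mathbb{R}_+^2 \times \mathbb{R}^2$.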



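The condition $D_{x,y}L(\bar x, \bar y, \lambda, 0) = 0$ reduces to

$$D_{x,y}\Big(F(\bar x, \bar y, 0) + \sum_{i=1}^{4} \lambda_i G_i(\bar x, \bar y, 0)\Big)
= 2 \begin{pmatrix} \bar y - \bar x \\ \bar x - \bar y \end{pmatrix}
+ \lambda_1 \begin{pmatrix} -\nabla f(\bar x) \\ 0 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 0 \\ -\nabla f(\bar y) \end{pmatrix}
+ \lambda_3 \begin{pmatrix} (0, 0, \dots, 0, 0, 1)^T \\ 0 \end{pmatrix}
+ \lambda_4 \begin{pmatrix} 0 \\ (0, 0, \dots, 0, 0, 1)^T \end{pmatrix} = 0.$$

Here, $G_i(x, y, 0)$ is the $i$th row of $G(x, y, 0)$ for $1 \leq i \leq 4$. This is exactly the
KKT conditions, and can be rewritten as

$$\begin{aligned} 2(\bar y - \bar x) - \lambda_1 \nabla f(\bar x) + \lambda_3 (0, 0, \dots, 0, 0, 1)^T &= 0, \\ 2(\bar x - \bar y) - \lambda_2 \nabla f(\bar y) + \lambda_4 (0, 0, \dots, 0, 0, 1)^T &= 0. \end{aligned} \tag{4.3}$$

It is clear that $\lambda_1$ and $\lambda_2$ cannot be zero, and so we have

$$\nabla f(\bar x)^T = \Big(0, 0, \dots, 0, \frac{2}{\lambda_1}, \frac{\lambda_3}{\lambda_1}\Big), \qquad
\nabla f(\bar y)^T = \Big(0, 0, \dots, 0, -\frac{2}{\lambda_2}, \frac{\lambda_4}{\lambda_2}\Big).$$

Recall that $\lambda_1, \lambda_2 \geq 0$, so this gives more information about $\nabla f(\bar x)$ and $\nabla f(\bar y)$.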

We next discuss the optimality of the outer minimization problem of (2.1), which
can be studied by perturbations in the parameter $u$ of (4.1), but we first recall a
result on the first order sensitivity of optimal solutions.

Definition 4.2. (Robinson's constraint qualification) (from [3, Definition 2.86])
We say that Robinson's constraint qualification holds at $(\bar x, \bar y) \in \mathbb{R}^{m+1} \times \mathbb{R}^{m+1}$ if
the regularity condition

$$0 \in \mathrm{int}\,\{G(\bar x, \bar y, 0) + \mathrm{Range}(D_{x,y}G(\bar x, \bar y, 0)) - K\}$$

is satisfied.

Theorem 4.3. (Parametric optimization) (from [3, Theorem 4.26]) For problem
(4.1), let $(\bar x(u), \bar y(u))$ be as defined earlier. Suppose that

(i) Robinson's constraint qualification holds at $(\bar x(0), \bar y(0))$, and

(ii) if $u_n \to 0$, then $(P_{u_n})$ possesses an optimal solution $(\bar x(u_n), \bar y(u_n))$ that has
a limit point $(\bar x, \bar y)$.

Then $v(\cdot)$ is directionally differentiable at $u = 0$ and

$$v'(0) = D_u L(\bar x, \bar y, \lambda, 0).$$

We proceed to prove our result.

Proposition 4.4. (Optimality condition on $\nabla f(\bar y)$) Consider the setup so far in
this section and suppose Assumption 4.1 holds. If $\nabla f(\bar y)$ is not a positive multiple
of $(0, 0, \dots, 0, 1, 0)^T$ at $u = 0$, then we can perturb $u$ so that (4.1) has an increase
in objective.

Proof. We first obtain first order sensitivity information from Theorem 4.3. Recall
that by definition, Robinson's constraint qualification holds at $(\bar x, \bar y)$ if

$$0 \in \mathrm{int}\,\{G(\bar x, \bar y, 0) + \mathrm{Range}(D_{x,y}G(\bar x, \bar y, 0)) - K\}.$$

From (4.3), it is clear that $\nabla f(\bar x)$ and $(0, \dots, 0, 0, 1)$ are linearly independent, and
so are $\nabla f(\bar y)$ and $(0, \dots, 0, 0, 1)$. From the formula of $D_{x,y}G(\bar x, \bar y, 0)$ in (4.2), we
see immediately that $\mathrm{Range}(D_{x,y}G(\bar x, \bar y, 0)) = \mathbb{R}^4$, thus Robinson's constraint
qualification indeed holds.

Suppose that $\lim_{n \to \infty} t_n = 0$. We prove that part (ii) of Theorem 4.3 holds by
proving that $(\bar x(t_n), \bar y(t_n))$ cannot have any other limit points. Suppose that $(x', y')$
is a limit point of $\{(\bar x(t_n), \bar y(t_n))\}_{n=1}^\infty$. It is clear that $x', y' \in S(0)$.

We can find $y_n \to \bar y$ such that $y_n \in S(t_n)$ and $f(y_n) = b$. For example, we can
use the Implicit Function Theorem with the constraints

$$f(y) = b, \qquad g(y, u) = 0,$$

where $g(y, u) = (0, 0, \dots, 0, u, 1)\, y$. The derivatives with respect to $y_m$ and $y_{m+1}$
are

$$\frac{\partial g}{\partial y_m}(\bar y, 0) = 0, \qquad \frac{\partial g}{\partial y_{m+1}}(\bar y, 0) = 1.$$

Therefore, for $y_1 = y_2 = \cdots = y_{m-2} = y_{m-1} = 0$ and any choice of $u$ close to zero,
there is some $y_m$ and $y_{m+1}$ such that $y \in S(u)$ and $f(y) = b$.

Clearly $|\bar x - y_n| \leq |\bar x(t_n) - \bar y(t_n)|$. Taking limits as $n \to \infty$, we have $|\bar x - \bar y| \leq
|x' - y'|$. Since $(\bar x, \bar y)$ minimize $F$, it follows that $|\bar x - \bar y| = |x' - y'|$, and by the
uniqueness of solutions to $(P_0)$, we can assume that $x' = \bar x$ and $y' = \bar y$.

Theorem 4.3 implies that $v'(0) = D_u L(\bar x, \bar y, \lambda, 0)$. We now calculate $D_u L(\bar x, \bar y, \lambda, 0)$.
It is clear that $D_u G(\bar x, \bar y, 0) = (0, 0, 0, 1)^T$, and so $D_u L(\bar x, \bar y, \lambda, 0) = \lambda_4$. Since
$\nabla f(\bar y)$ is not a multiple of $(0, 0, \dots, 0, 1, 0)^T$ at $u = 0$, $\lambda_4 \neq 0$, and this gives the
conclusion we need. □

A direct consequence of Proposition 4.4 is the following easily checkable condition.

Theorem 4.5. (Gradients are opposite) Let $(S_i, x_i, y_i)$ be an optimizing triple to
(2.1) for some $l_i$ such that $S_i \cap (\mathrm{lev}_{\geq l_i} f) \cap U_i$ is closed, and $(x_i, y_i)$ is the unique
pair of points in $S_i \cap (\mathrm{lev}_{\geq l_i} f) \cap U_i$ satisfying $|x_i - y_i| = \mathrm{diam}(S_i \cap (\mathrm{lev}_{\geq l_i} f) \cap U_i)$.
Then $\nabla f(x_i)$ and $\nabla f(y_i)$ are nonzero and point in opposite directions.

Proof. We can look at an $m + 1$ dimensional subspace which reduces to the setting
that we are considering so far in this section. By Proposition 4.4, $\nabla f(y_i)$ is a
positive multiple of $x_i - y_i$ at optimality. Similarly, $\nabla f(x_i)$ is a positive multiple
of $y_i - x_i$ at optimality, and the result follows. □
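
The condition in Theorem 4.5 is cheap to test once two gradients are available. The sketch below is one possible check, with a hypothetical helper `gradients_opposite` and tolerance `tol`. The quadratic test function, the level and the line $S$ are assumptions chosen so that the endpoints can be written down by hand.

```python
import math

def gradients_opposite(grad_x, grad_y, tol=1e-8):
    """True if both gradients are nonzero and point in opposite directions."""
    norm_x = math.sqrt(sum(g * g for g in grad_x))
    norm_y = math.sqrt(sum(g * g for g in grad_y))
    if norm_x <= tol or norm_y <= tol:
        return False
    cosine = sum(a * b for a, b in zip(grad_x, grad_y)) / (norm_x * norm_y)
    return cosine <= -1.0 + tol

# Assumed data: f(x1, x2) = x2^2 - x1^2 with level l = -0.01 and S the x1-axis,
# so that S intersected with the superlevel set is the segment from (-0.1, 0) to (0.1, 0).
def grad_f(p):
    return (-2.0 * p[0], 2.0 * p[1])

x_i, y_i = (0.1, 0.0), (-0.1, 0.0)
opposite = gradients_opposite(grad_f(x_i), grad_f(y_i))
```

In this toy case the gradient at `y_i` is `(0.2, 0)`, a positive multiple of `x_i - y_i`, and the gradient at `x_i` is its negative, so the check succeeds.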

We remark on how to start the algorithm. We look at critical points of Morse
index 1 first. In this case, two local minima $x_1$, $x_2$ are needed before the mountain
pass algorithm can guarantee the existence of a critical point $x_3$. For any value
above the critical value corresponding to the critical point of Morse index 1, the
level set contains a path connecting $x_1$ and $x_2$ passing through $x_3$.

To find the next critical point of Morse index 2 we remark that under mild
conditions, if $\mathrm{lev}_{\leq a} f$ contains a closed path homeomorphic to $\mathbb{S}^1$, the boundary of
the disc of dimension 2, then the linking principle guarantees the existence of a
critical point through the multidimensional mountain pass theorem. Theorem 6.1,
which we quote later, gives an idea how this is possible. We refer the reader to [TB]
and [8, Chapter 19] for more details on linking methods.

We now illustrate with an example that without the assumption that $(x_i, y_i)$
is the unique pair of points satisfying $|x_i - y_i| = \mathrm{diam}(S_i \cap (\mathrm{lev}_{\geq l_i} f) \cap U_i)$, the
conclusion in Theorem 4.5 need not hold.



Lemma 4.6. (Shortest line segments) Suppose lines $l_1$ and $l_2$ intersect at the origin
in $\mathbb{R}^2$, and let $P$ be a point on the angle bisector as shown in the diagram on the left
of Figure 4.1. The minimum distance of the line segment $AB$, where $A$ is a point
on $l_1$ and $B$ is a point on $l_2$ and $AB$ passes through $P$, is attained when $OAB$ is
an isosceles triangle with $AB$ as its base.

Proof. Much of this is high school trigonometry and plane geometry, but we present
full details for completeness. Let $\alpha$ be the angle $\angle AOP$, $\theta$ be the angle $\angle PAO$,
and $d = |OP|$. By using the sine rule, we get

$$|AB| = d \left( \frac{\sin\alpha}{\sin\theta} + \frac{\sin\alpha}{\sin(\pi - 2\alpha - \theta)} \right).$$

The problem is now reduced to finding the $\theta$ that minimizes the value above.
Continuing the arithmetic gives:

$$\begin{aligned}
d \left( \frac{\sin\alpha}{\sin\theta} + \frac{\sin\alpha}{\sin(\pi - 2\alpha - \theta)} \right)
&= d \sin\alpha \left( \frac{1}{\sin\theta} + \frac{1}{\sin(2\alpha + \theta)} \right) \\
&= d \sin\alpha \left( \frac{\sin\theta + \sin(2\alpha + \theta)}{\sin(\theta) \sin(2\alpha + \theta)} \right) \\
&= d \sin\alpha \left( \frac{2 \sin(\alpha + \theta) \cos\alpha}{\frac{1}{2}[\cos(2\alpha) - \cos(2\alpha + 2\theta)]} \right) \\
&= 2 d \sin(2\alpha) \left( \frac{\sin(\alpha + \theta)}{\cos(2\alpha) - \cos(2\alpha + 2\theta)} \right).
\end{aligned}$$

We now differentiate the $\frac{\sin(\alpha + \theta)}{\cos(2\alpha) - \cos(2\alpha + 2\theta)}$ term above, which gives

$$\frac{d}{d\theta} \left( \frac{\sin(\alpha + \theta)}{\cos(2\alpha) - \cos(2\alpha + 2\theta)} \right)
= \frac{\cos(\alpha + \theta)[\cos(2\alpha) - \cos(2\alpha + 2\theta)] - 2 \sin(2\alpha + 2\theta) \sin(\alpha + \theta)}{[\cos(2\alpha) - \cos(2\alpha + 2\theta)]^2}.$$

The numerator is simplified to be:

$$\begin{aligned}
& \cos(\alpha + \theta)[\cos(2\alpha) - \cos(2\alpha + 2\theta)] - 2 \sin(2\alpha + 2\theta) \sin(\alpha + \theta) \\
&= \cos(\alpha + \theta)[\cos(2\alpha) - 2\cos^2(\alpha + \theta) + 1] - 4 \sin^2(\alpha + \theta) \cos(\alpha + \theta) \\
&= \cos(\alpha + \theta)[\cos(2\alpha) - 2\cos^2(\alpha + \theta) + 1 - 4 \sin^2(\alpha + \theta)] \\
&= \cos(\alpha + \theta)[\cos(2\alpha) - 2\cos^2(\alpha + \theta) + 4\cos^2(\alpha + \theta) - 3] \\
&= \cos(\alpha + \theta)[\cos(2\alpha) + 2\cos^2(\alpha + \theta) - 3].
\end{aligned}$$

With this formula, we see that the condition for $\frac{d}{d\theta} \left( \frac{\sin(\alpha + \theta)}{\cos(2\alpha) - \cos(2\alpha + 2\theta)} \right) = 0$ is to
have $\cos(\alpha + \theta) = 0$ or $\cos(2\alpha) + 2\cos^2(\alpha + \theta) = 3$. The first case gives us $\theta = \frac{\pi}{2} - \alpha$,
which gives us the required conclusion. The second case requires $\alpha = 0$ or $\alpha = \pi$
and $\theta = 0$ or $\theta = \pi$, which are degenerate cases. This gives us all optimum solutions
to our problem, and concludes the proof. □
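
The conclusion of Lemma 4.6 can be sanity-checked numerically. The sketch below evaluates the sine-rule expression for $|AB|$ on a fine grid of angles $\theta$ and picks the minimizer. The value of `alpha`, the grid size and the names `segment_length` and `best_theta` are arbitrary choices for this illustration.

```python
import math

def segment_length(theta, alpha, d=1.0):
    """|AB| from the sine rule, with theta the angle PAO and alpha the angle AOP."""
    return d * math.sin(alpha) * (1.0 / math.sin(theta) + 1.0 / math.sin(2.0 * alpha + theta))

alpha = 0.4                       # half of the angle between the two lines (arbitrary)
upper = math.pi - 2.0 * alpha     # theta must leave a positive third angle in triangle OAB
grid = [upper * k / 200000 for k in range(1, 200000)]
best_theta = min(grid, key=lambda theta: segment_length(theta, alpha))
```

The minimizer found on the grid agrees with $\theta = \frac{\pi}{2} - \alpha$, and the minimal length equals $2 d \tan\alpha$, the base of the isosceles triangle.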

We now create an example in $\mathbb{R}^3$ that illustrates that the omission of the condition
of unique solutions need not give us points whose gradients point in opposite
directions.

Example 4.7. (Gradients need not be opposite) Define the four lines $L_1$ to $L_4$ by

$$\begin{aligned}
L_1 &= \{(0, 0, -1) + \lambda(1, 0, 1) \mid \lambda \in \mathbb{R}_+\}, \\
L_2 &= \{(0, 0, -1) + \lambda(-1, 0, 1) \mid \lambda \in \mathbb{R}_+\}, \\
L_3 &= \{(0, 0, 1) + \lambda(0, 1, -1) \mid \lambda \in \mathbb{R}_+\}, \\
L_4 &= \{(0, 0, 1) + \lambda(0, -1, -1) \mid \lambda \in \mathbb{R}_+\}.
\end{aligned}$$

The lines $L_1$ and $L_2$ lie in the $x$-$z$ plane, while the lines $L_3$ and $L_4$ lie in the $y$-$z$
plane. See the diagram on the right of Figure 4.1.

Consider first the problem of finding a plane $S$ that is a minimizer of the maximum
of the distances between the points defined by the intersections of $S$ and the
$L_j$'s. We now show that $S$ has to be the $x$-$y$ plane. The plane $S$ intersects the $z$ axis
at some point $(0, 0, p)$. When $S$ is the $x$-$y$ plane, the maximum distance between
the points is 2. By Lemma 4.6, the distance between the points $S \cap L_1$ and $S \cap L_2$
is at least $2(1 + p)$, while the distance between the points $S \cap L_3$ and $S \cap L_4$ is
at least $2(1 - p)$. This tells us that the $x$-$y$ plane is optimal.

With this observation, we now construct our example. Consider the function
$f$ defined by



f{x,y,z) = - 



1 + z 1- z 



4/3 



1 - Z 



4/3 



The level set $\mathrm{lev}_{\geq -2} f$ contains the lines $L_1$ to $L_4$. This means that
$\mathrm{diam}(S \cap \mathrm{lev}_{\geq -2} f) \geq 2$. This is in fact an equality when $S$ is the $x$-$y$ plane, with the
maximizers being the pairs $\{\pm(1, 0, 0)\}$ and $\{\pm(0, 1, 0)\}$.
The gradient $\nabla f(x, y, z)$ is



/ 



^f{x,y,z) 



+z 



V 3 (l+z + 1-^) ( (l+z)2 + (1-2)2) 3(1+2 1-2) ( 



y 
1-2 



y 
1-2 



1/3 



1/3 



y 

1-2 



y 

1-2 



1/3 



1/3 



1/3 



(1 + 2)2 



(1-2)0 / 
With this, we can evaluate $\nabla f$ at $\pm(1, 0, 0)$ and $\pm(0, 1, 0)$ to be



Neither of the pairs $\{\pm(1, 0, 0)\}$ and $\{\pm(0, 1, 0)\}$ has opposite pointing gradients,
which concludes our example.

5. Another convergence property of critical points 

In this section, we look at a condition on critical points similar to Proposition
3.2 that can be helpful for numerical methods for finding critical points. Theorem
5.2 below does not seem to be easily found in the literature, and can be seen as a
local version of the mountain pass theorem.

We prove Theorem 5.2 in the more general setting of metric spaces. Such a treatment
includes the case of nonsmooth functions. We recall the following definitions
in metric critical point theory from [7, 10].

Definition 5.1. Let $(X, d)$ be a metric space. We call the point $\bar x$ Morse regular
for the function $f : X \to \mathbb{R}$ if, for some numbers $\gamma, \sigma > 0$, there is a continuous
function

$$\phi : \mathbb{B}(\bar x, \gamma) \times [0, \gamma] \to X$$

such that all points $x \in \mathbb{B}(\bar x, \gamma)$ and $t \in [0, \gamma]$ satisfy the inequality

$$f(\phi(x, t)) \leq f(x) - \sigma t,$$

and that $\phi(\cdot, 0) : \mathbb{B}(\bar x, \gamma) \to \mathbb{B}(\bar x, \gamma)$ is the identity map. The point $\bar x$ is Morse
critical if it is not Morse regular.

If for some $\phi$, there is some $\kappa > 0$ such that $\phi$ also satisfies the inequality

$$d(\phi(x, t), x) \leq \kappa t,$$

then we call $\bar x$ deformationally regular. The point $\bar x$ is deformationally critical if it
is not deformationally regular.

It is a fact that if $X$ is a Banach space and $f$ is locally Lipschitz, then deformationally
critical points are Clarke critical. The following theorem gives a strategy
for identifying deformationally critical points.

Theorem 5.2. (Critical points from sequences of linking sets) Let $X$ be a metric
space and $f : X \to \mathbb{R}$. Suppose there is some open set $U$ containing $\bar x$ and sequences of sets
$\{\Phi_i\}_{i=1}^\infty$ and $\{\Gamma_i\}_{i=1}^\infty$ such that

(1) $\Phi_i$ and $\partial\Gamma_i$ link.

(2) $\Gamma_i$ are homeomorphic to $\mathbb{B}^m$ for all $i$, and $\max_{x \in \partial\Gamma_i} f(x) < \inf_{x \in \Phi_i \cap U} f(x)$.

(3) For any open set $V$ containing $\bar x$, there is some $I > 0$ such that $\Gamma_i \subset V$ for
all $i > I$.

(4) $f$ is Lipschitz in $U$.

Then $\bar x$ is deformationally critical.

Proof. Suppose $\bar x$ is deformationally regular. Then there are $\gamma, \sigma, \kappa > 0$ and $\phi :
\mathbb{B}(\bar x, \gamma) \times [0, \gamma] \to X$ such that the inequalities

$$f(\phi(x, t)) \leq f(x) - \sigma t \quad \text{and} \quad d(\phi(x, t), x) \leq \kappa t$$

hold for all points $x \in \mathbb{B}(\bar x, \gamma)$ and $t \in [0, \gamma]$, and $\phi(\cdot, 0) : \mathbb{B}(\bar x, \gamma) \to \mathbb{B}(\bar x, \gamma)$ is the
identity map. We may reduce $\gamma$ as necessary and assume that $U = \mathbb{B}(\bar x, \gamma)$.

Condition (3) implies that for any $\alpha > 0$, there is some $I_1$ such that $\Gamma_i \subset
\mathbb{B}(\bar x, \alpha)$ for all $i > I_1$. Consider $\Gamma_{i,t} := \phi\big((\Gamma_i \times \{t\}) \cup (\partial\Gamma_i \times [0, t])\big)$. Provided
$\alpha$ and $t > 0$ are sufficiently small, we have $\Gamma_{i,t} \subset U$.

Since $f$ is Lipschitz in $U$, let $K$ be the modulus of Lipschitz continuity in $U$. We
have $\max_{x \in \Gamma_i} f(x) \leq \max_{x \in \partial\Gamma_i} f(x) + K \mathrm{diam}(\Gamma_i)$. Also,

$$\max_{x \in \Gamma_{i,t}} f(x) \leq \max\Big\{ \max_{x \in \partial\Gamma_i} f(x) + K \mathrm{diam}(\Gamma_i) - \sigma t,\ \max_{x \in \partial\Gamma_i} f(x) \Big\}.$$

By condition (3), there is some $I_2 > 0$ such that if $i > I_2$, then $\mathrm{diam}(\Gamma_i) < \frac{\sigma t}{K}$.
So for $i > \max(I_1, I_2)$, we have

$$\max_{x \in \Gamma_{i,t}} f(x) = \max_{x \in \partial\Gamma_i} f(x) < \inf_{x \in \Phi_i \cap U} f(x).$$

However, the fact that $\partial\Gamma_i$ and $\Phi_i$ link implies that $\Gamma_{i,t}$ and $\Phi_i$ must intersect,
and since $\Gamma_{i,t} \subset U$, $\Gamma_{i,t}$ and $\Phi_i \cap U$ must intersect. This is a contradiction, so $\bar x$ is
deformationally critical. □

It is reasonable to choose $\Gamma_i$ to be a simplex (that is, a convex hull of $m + 1$
points) and $\Phi_i$ to be an affine space. If the sequence of sets $\{\Gamma_i\}_{i=1}^\infty$ converges to
the single point $\bar x$ and $f$ is $C^2$ there, a quadratic approximation of $f$ using only the
knowledge of the values of $f$ and $\nabla f$ on the vertices of the simplex would be a good
approximation of $f$ on the simplex. We outline our strategy below.

Algorithm 5.3. (Obtaining unknowns in quadratic) Let $h : \mathbb{R}^m \to \mathbb{R}$ be defined
by $h(x) = \frac{1}{2} x^T A x + b^T x + c$, and let $p_1, \dots, p_{m+1}$ be $m + 1$ points in $\mathbb{R}^m$. Suppose
that the values of $h(p_i)$ and $\nabla h(p_i)$ are known for all $i = 1, \dots, m + 1$. We seek to
obtain the values of $A$, $b$ and $c$.

(1) Let $P \in \mathbb{R}^{m \times m}$ be the matrix such that the $i$th column is $p_{i+1} - p_i$, and let
$D \in \mathbb{R}^{m \times m}$ be the matrix such that the $i$th column is $\nabla h(p_{i+1}) - \nabla h(p_i)$.
Calculate $A$ with $A = D P^{-1}$.

(2) Calculate $b$ with $b = \nabla h(p_1) - A p_1$.

(3) Calculate $c$ with $c = h(p_1) - \frac{1}{2} p_1^T A p_1 - b^T p_1$.
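
The three steps above translate directly into a few lines of NumPy. The following is a minimal sketch assuming that $P$ is invertible and that the data come from an exact quadratic. The function name `fit_quadratic`, the row-wise storage of points and gradients, and the test data are assumptions for illustration.

```python
import numpy as np

def fit_quadratic(points, values, gradients):
    """Recover A, b, c of h(x) = 0.5 x^T A x + b^T x + c from m+1 points in R^m.

    points and gradients have shape (m+1, m), one row per point; values has shape (m+1,).
    """
    points = np.asarray(points, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    P = (points[1:] - points[:-1]).T          # i-th column is p_{i+1} - p_i
    D = (gradients[1:] - gradients[:-1]).T    # i-th column is grad h(p_{i+1}) - grad h(p_i)
    A = np.linalg.solve(P.T, D.T).T           # step (1): A = D P^{-1}, without forming the inverse
    b = gradients[0] - A @ points[0]          # step (2)
    c = values[0] - 0.5 * points[0] @ A @ points[0] - b @ points[0]   # step (3)
    return A, b, c

# Assumed test data generated from a known quadratic.
A0 = np.array([[2.0, 1.0], [1.0, -3.0]])
b0 = np.array([1.0, -2.0])
c0 = 0.5
pts = np.array([[0.5, -1.0], [1.5, -1.0], [0.5, 0.0]])
vals = np.array([0.5 * p @ A0 @ p + b0 @ p + c0 for p in pts])
grads = np.array([A0 @ p + b0 for p in pts])
A, b, c = fit_quadratic(pts, vals, grads)
```

Solving the linear system $P^T A^T = D^T$ instead of inverting $P$ is a choice made here for numerical convenience. For exact quadratic data both routes give the same $A$.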



If $h$ is $C^2$ instead of being a quadratic, then the procedure in Algorithm 5.3 can
be used to approximate the values of $h$ on a simplex.

In Lemma 5.4 below, given $m + 1$ points in $\mathbb{R}^n$, we need to approximate a
quadratic function on the simplex $\Delta$ with these vertices, as a subset of $\mathbb{R}^n$. Even though
$m < n$, the procedure to obtain a quadratic estimate of $f$ in $\Delta$ is a straightforward
extension of Algorithm 5.3.

Lemma 5.4. (Quadratic estimate on simplex) Let $f : \mathbb{R}^n \to \mathbb{R}$ be $C^2$ and $\bar x \in \mathbb{R}^n$.
Let $p_1, \dots, p_{m+1}$ be points close to $\bar x$. Suppose the matrix $P \in \mathbb{R}^{n \times m}$, whose $i$th
column is $p_{i+1} - p_i$, has full column rank. Let $f_e : \Delta \to \mathbb{R}$ be defined as
the quadratic function obtained using $f(p_i)$ and $\nabla f(p_i)$ for $i = 1, \dots, m + 1$ with
Algorithm 5.3. For any $\epsilon > 0$, there is some $\delta > 0$ such that if $p_1, \dots, p_{m+1} \in
\mathbb{B}(\bar x, \delta)$, then

$$|f_e(x) - f(x)| \leq \frac{1}{2} \mathrm{diam}(\Delta)^2 \epsilon (1 + \kappa \|P\| \|P^\dagger\|) \quad \text{for all } x \in \Delta,$$

where $\|\cdot\|$ stands for the matrix 2-norm, $P^\dagger$ is the pseudoinverse of $P$, and $\kappa$ is
some constant dependent only on $n$ and $m$.



Proof. The first step of this proof is to show that step 1 of Algorithm |5.3| gives a 
matrix in M"^" which is a good approximation of how A — V^/(x) acts on the 
lineality space of the afhne hull of A. Since / is C^, for any e > 0, there exists <5 > 
such that |V/(x) - Vfix') - A{x - x')\ < e\x - x'\ for all x,x' € M{x,S). Thus, 
there is some k > depending only on m and n such that if pi, . . . ,Pm+i & V>{x, 6), 
then 

(5.1) \\D - AP\\ < Ke\\P\\. 

Let P = QR, where Q e K"""" has orthonormal columns and i? e M™""", be a 
QR decomposition of P. For any v G M" in the range of P, or equivalently, v — Qv' 
for some v' G K™, we want to show that \\Av — DR~^Q'^v\\ is small. We note that 
\v\ = \v'\, and we have the following calculation. 

\\Av - DR-^Q^v\\ = \\AQv' ~ DR-^Q^Qv'\\ 

= \\AQv' - DR-^v'\\ 

< \\AQ-DR-^\\\v'\ 

< \\AQR- D\\\\R'^\\\v\ 
= \\D-AP\\\\R-mv\ 

< k4P\\\\R-^\v\. 

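Next, for $x, x' \in \mathbb{B}(\bar{x}, \delta)$, let $d = \operatorname{unit}(x' - x)$. Then

\begin{align*}
f(x') - f(x) &= \int_0^{|x'-x|} \nabla f(x + sd)^T d \, ds \\
&= \int_0^{|x'-x|} \left[ \int_0^s d^T \nabla^2 f(x + td) d \, dt + \nabla f(x)^T d \right] ds.
\end{align*}

Since $f$ is $C^2$, we may reduce $\delta$ if necessary so that $\|A - \nabla^2 f(x + td)\| < \epsilon$ for all
$0 \le t \le |x' - x|$. This tells us that

\begin{align*}
|d^T (DR^{-1}Q^T) d - d^T \nabla^2 f(x + td) d| &\le |d| \, \|DR^{-1}Q^T d - \nabla^2 f(x + td) d\| \\
&\le \|DR^{-1}Q^T d - Ad\| + \|Ad - \nabla^2 f(x + td) d\| \\
&\le \kappa \epsilon \|P\| \|R^{-1}\| |d| + \|A - \nabla^2 f(x + td)\| \\
&\le \epsilon \left(1 + \kappa \|P\| \|R^{-1}\|\right).
\end{align*}

We have

\begin{align*}
\nabla f(x)^T (x' - x) + \int_0^{|x'-x|} \int_0^s d^T (DR^{-1}Q^T) d \, dt \, ds
&= \nabla f(x)^T (x' - x) + \tfrac{1}{2} |x' - x|^2 d^T (DR^{-1}Q^T) d \\
&= \nabla f(x)^T (x' - x) + \tfrac{1}{2} (x' - x)^T (DR^{-1}Q^T)(x' - x).
\end{align*}

Continuing with the arithmetic earlier, we obtain

\begin{align*}
\left| f(x') - f(x) - \nabla f(x)^T (x' - x) - \tfrac{1}{2} (x' - x)^T (DR^{-1}Q^T)(x' - x) \right|
&\le \int_0^{|x'-x|} \epsilon \left(1 + \kappa \|P\| \|R^{-1}\|\right) s \, ds \\
&= \tfrac{1}{2} |x' - x|^2 \epsilon \left(1 + \kappa \|P\| \|R^{-1}\|\right).
\end{align*}

Let $x = p_1$ and $x'$ be any point in $\Delta$. Define $f_e(x')$ by

\[ f_e(x') = f(x) + \nabla f(x)^T (x' - x) + \tfrac{1}{2} (x' - x)^T (DR^{-1}Q^T)(x' - x), \]

which is the quadratic function obtained using Algorithm 5.3. We have

\begin{align*}
|f_e(x') - f(x')| &\le \tfrac{1}{2} |x' - x|^2 \epsilon \left(1 + \kappa \|P\| \|R^{-1}\|\right) \\
&\le \tfrac{1}{2} \operatorname{diam}(\Delta)^2 \epsilon \left(1 + \kappa \|P\| \|R^{-1}\|\right).
\end{align*}

Since $P^\dagger = R^{-1} Q^T$ and $Q$ has orthonormal columns, $\|R^{-1}\| = \|P^\dagger\|$, which gives what
we need. □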
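The estimate in Lemma 5.4 can be illustrated with a small sketch. It assumes, as the proof suggests, that the Hessian estimate acts on the range of $P$ as $D P^\dagger$, where the columns of $D$ are the gradient differences $\nabla f(p_{i+1}) - \nabla f(p_i)$. The names `solve`, `dot` and `quadratic_estimate` are hypothetical helpers introduced here, not part of Algorithm 5.3.

```python
def solve(M, b):
    """Solve the square system M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * c[k] for k in range(i + 1, n))
        c[i] = s / aug[i][i]
    return c


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_estimate(points, values, grads):
    """Build f_e from f(p_i) and grad f(p_i), i = 1, ..., m+1 (assumed construction).

    Columns of P are p_{i+1} - p_i; columns of D are the matching gradient differences.
    """
    m = len(points) - 1
    P = [[b - a for a, b in zip(points[i], points[i + 1])] for i in range(m)]
    D = [[b - a for a, b in zip(grads[i], grads[i + 1])] for i in range(m)]
    gram = [[dot(P[i], P[j]) for j in range(m)] for i in range(m)]  # P^T P

    def f_e(x):
        v = [xi - pi for xi, pi in zip(x, points[0])]
        # c = P^+ v, computed from the normal equations (P has full column rank).
        c = solve(gram, [dot(P[i], v) for i in range(m)])
        # Hessian estimate applied to v: (D P^+) v.
        Hv = [sum(c[j] * D[j][k] for j in range(m)) for k in range(len(v))]
        return values[0] + dot(grads[0], v) + 0.5 * dot(v, Hv)

    return f_e
```

For an exact quadratic, $D = AP$ holds exactly, so `f_e` reproduces $f$ on the affine hull of the simplex; for a general $C^2$ function the lemma bounds the discrepancy.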



In the statement of Lemma 5.4, we chose the domain of $f$ to be $\mathbb{R}^n$ so that the
inequality (5.1) follows from the equivalence of finite dimensional norms. Next,
the accuracy of the computed values of $\nabla f(p_{i+1}) - \nabla f(p_i)$ might be poor, which
makes the quadratic approximation strategy ineffective once we are too close to the
critical point $\bar{x}$. We remark on how we can overcome this problem by exploiting
concavity.

Remark 5.5. (Exploiting concavity) The lineality space of the affine hull of $\Delta$ may
span the eigenspaces of the $m$ negative eigenvalues of $\nabla^2 f(\bar{x})$ once we are close
to the critical point $\bar{x}$. This can be checked by calculating the Hessian as was
done earlier. If this is the case, $f$ would be concave in $\Delta$ when $p_1, \dots, p_{m+1}$ are
sufficiently close to $\bar{x}$. The estimate $f(x) \le f(p_i) + \nabla f(p_i)^T (x - p_i)$ would hold
for all $x \in \Delta$ and $1 \le i \le m + 1$, which can give a sufficiently good estimate of
$\max_{x \in \partial\Delta} f(x)$ through linear programming.

6. Fast local convergence 

In this section, we discuss how we can find good lower bounds that allow us
to achieve better convergence if $f$ is $C^2$ and $X = \mathbb{R}^n$. Our method extends the
local superlinearly convergent method in [12] for finding smooth critical points of
mountain pass type when $X = \mathbb{R}^n$.

Figure 6.1. Illustration of the multidimensional mountain pass theorem.

Let us recall the multidimensional mountain pass theorem due to Rabinowitz.

Theorem 6.1. (Multidimensional mountain pass theorem) [14] Let $X = Y \oplus Z$ be
a Banach space with $Z$ closed in $X$ and $\dim(Y) < \infty$. For $\rho > 0$ define

\[ M := \{u \in Y \mid \|u\| \le \rho\}, \qquad M_0 := \{u \in Y \mid \|u\| = \rho\}. \]

Let $f : X \to \mathbb{R}$ be $C^1$, and

\[ b := \inf_{u \in Z} f(u) > a := \max_{u \in M_0} f(u). \]

If $f$ satisfies the Palais-Smale condition and

\[ c := \inf_{\gamma \in \Gamma} \max_{u \in M} f(\gamma(u)), \quad \text{where } \Gamma := \{\gamma : M \to X \text{ is continuous} \mid \gamma|_{M_0} = \mathrm{id}\}, \]

then $c$ is a critical value of $f$.

For the case when $X = \mathbb{R}^3$, we have an illustration in Figure 6.1 of the case
$f : \mathbb{R}^3 \to \mathbb{R}$ defined by $f(x) = x_3^2 - x_1^2 - x_2^2$. The critical point $0$ has critical value
$0$. Choose $Y$ to be $\mathbb{R}^2 \times \{0\}$ and $Z$ to be $\{0\} \times \mathbb{R}$. The union of the two blue cones
is the level set $\mathrm{lev}_{=0} f := f^{-1}(0)$, while the bold red ring denotes $M_0$ and the red
disc denotes a possible image of $M$ under $\gamma$.

It seems intuitively clear that $\gamma(M)$ has to intersect the vertical axis. This is
indeed the case, since $M_0$ and $Z$ link. (See for example [17, Example II.8.2].)

With this observation, we easily see that $\max_{u \in M} f(\gamma(u)) \ge \min_{z \in Z} f(z)$. Thus
the critical value $c = \inf_{\gamma \in \Gamma} \max_{u \in M} f(\gamma(u))$ from Theorem 6.1 is bounded from
below by $\min_{z \in Z} f(z)$. This gives a lower bound for the critical value. In the
mountain pass case when $m = 1$, the set $M_0$ consists of two points, and the space
$Z$ separates the two points in $M_0$ so that any path connecting the two points in
$M_0$ must intersect $Z$.

A first try for a fast locally convergent algorithm is as follows:

Algorithm 6.2. A first try for a fast locally convergent algorithm to find saddle
points of Morse index $m$ for $f : X \to \mathbb{R}$.

(1) Set the iteration count $i$ to 0, and let $l_i$ be a lower bound of the critical
value.

(2) Find $x_i$ and $y_i$, where $(S_i, x_i, y_i)$ is an optimizing triple of

\[ \min_{S \in \mathcal{S}} \; \max_{x, y \in S \cap (\mathrm{lev}_{\ge l_i} f) \cap U_i} |x - y|, \tag{6.1} \]

where $U_i$ is an open set. Here $\mathcal{S}$ is the set of $m$-dimensional affine subspaces
of $\mathbb{R}^n$ intersecting $U_i$. (The difference between this formula and (2.1) is that
we take level sets of level $l_i$ instead of $\frac{1}{2}(l_i + u_i)$.)

(3) For an optimal solution $(S_i, x_i, y_i)$, let $l_{i+1}$ be the lower bound of $f$ on the
$(n - m)$-dimensional affine space passing through $z_i := \frac{1}{2}(x_i + y_i)$ whose
lineality space is orthogonal to the lineality space of $S_i$.

(4) Increase $i$ and go back to step 2.

While Algorithm 6.2 as stated works fine for the case $m = 1$ to find critical points of
mountain pass type, the $l_i$'s calculated in this manner need not increase monotonically
to the critical value when $m > 1$. We first present a lemma on the min-max
problem (6.1) for the case of a quadratic.

Lemma 6.3. (Analysis on exact quadratic) Consider $f : \mathbb{R}^n \to \mathbb{R}$ defined by $f(x) =
\sum_{j=1}^n a_j x_j^2$, where $a_j$ are in decreasing order, with $a_j > 0$ for $1 \le j \le n - m$ and
$a_j < 0$ for $n - m + 1 \le j \le n$. The function $f$ has one critical point $0$, and $f(0) = 0$.
Given $l < 0$, an optimizing triple $(\bar{S}, \bar{x}, \bar{y})$ of the problem

\[ \min_{S \in \mathcal{S}} \; \max_{x, y \in S \cap \mathrm{lev}_{\ge l} f} |x - y|, \tag{6.2} \]

where $\mathcal{S}$ is the set of affine spaces of dimension $m$, satisfies

\[ \bar{x} = \left(0, 0, \dots, 0, \pm\sqrt{\frac{l}{a_{n-m+1}}}, 0, \dots, 0\right), \]

where the nonzero term is in the $(n - m + 1)$th position, and $\bar{y} = -\bar{x}$.

Proof. Let $S_{z,V} := \{z + Vw \mid w \in \mathbb{R}^m\}$, where $V \in \mathbb{R}^{n \times m}$ is a matrix with
orthonormal columns. Let the matrix $A \in \mathbb{R}^{n \times n}$ be the diagonal matrix with
entries $a_j$ in the $(j, j)$th position. The ellipse $S_{z,V} \cap \mathrm{lev}_{\ge l} f$ can be written as a
union of elements of the form $z + Vw$, where $w$ satisfies

\begin{align*}
& (z + Vw)^T A (z + Vw) \ge l \\
\Leftrightarrow \; & w^T V^T A V w + 2 z^T A V w + z^T A z \ge l.
\end{align*}

If the matrix $V^T A V$ has a nonnegative eigenvalue, then $S_{z,V} \cap \mathrm{lev}_{\ge l} f$ is unbounded.
Otherwise, the set

\[ \{z + Vw \mid w^T V^T A V w + 2 z^T A V w + z^T A z \ge l\} \]

is bounded. Therefore the inner maximization problem of (6.2) corresponding to
$S = S_{z,V}$ has a (not necessarily unique) pair of maximizers. We continue by completing
the square with respect to $w$ and let the symmetric matrix $C$ be the square root
$C = [-V^T A V]^{1/2}$.

\begin{align*}
& -w^T C^2 w + 2 z^T A V w + z^T A z \ge l \\
\Leftrightarrow \; & -(Cw - C^{-1} V^T A z)^T (Cw - C^{-1} V^T A z) + z^T A z + z^T A V C^{-2} V^T A^T z \ge l.
\end{align*}

The maximum length between two points of an ellipse is twice the distance between
the center and the furthest point on the ellipse. (This fact is easily proved by
reducing to, and examining, the two dimensional case.) The distance between the
center and the furthest point on the ellipse $S_{z,V} \cap \mathrm{lev}_{\ge l} f$ can be calculated to be

\[ \sqrt{\frac{1}{\alpha} \left(z^T A z + z^T A V C^{-2} V^T A^T z - l\right)}, \]

where $\alpha$ is the square of the smallest eigenvalue in $C$, or equivalently the negative
of the largest eigenvalue of $V^T A V$. The term $(z^T A z + z^T A V C^{-2} V^T A^T z)$ is
$\max\{f(x) \mid x \in S_{z,V}\}$, which we refer to as $\max_{S_{z,V}} f$. We now proceed to minimize
$\max_{S_{z,V}} f$ and maximize $\alpha$ separately.

Claim 1: $\max_{S_{z,V}} f \ge 0$.

We first prove that the subspace

\[ Z := \{x \mid x_n = x_{n-1} = \cdots = x_{n-m+1} = 0\} \]

must intersect $S_{z,V}$. Recall that $V^T A V$ is negative definite. Therefore for any
$w \ne 0$, $w^T V^T A V w < 0$. Since the first $n - m$ eigenvalues of $A$ are positive,
$Vw$ cannot be all zeros in its last $m$ components. This shows that the $m \times m$
matrix $V((n - m + 1) : n, 1 : m)$ is invertible. We can find some $w$ such that
the last $m$ components of $z + Vw$ are zeros. This shows that $S_{z,V} \cap Z \ne \emptyset$, so
$\max_{S_{z,V}} f \ge \min\{f(x) \mid x \in Z\} = 0$.

Claim 2: $\alpha \le -a_{n-m+1}$.

To find the maximum value of $\alpha$, we recall that it is the negative of the largest
eigenvalue of $V^T A V$. Since $V \in \mathbb{R}^{n \times m}$, the Courant-Fischer Theorem gives
$\alpha \le -a_{n-m+1}$.

Choose the affine space $\bar{S} := \{0\} \times \mathbb{R}^m$. This minimizes $\max_S f$ and maximizes
$\alpha$ as well, giving the optimal solution in the statement of the lemma. □
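As a small numerical companion to Lemma 6.3, the sketch below evaluates the optimizing pair for a diagonal quadratic. It assumes $f(x) = \sum_j a_j x_j^2$ (the form used in the proof, where the level set is described by $x^T A x \ge l$); the function names are hypothetical.

```python
import math


def diagonal_quadratic(a, x):
    """f(x) = sum_j a[j] * x[j]**2 (assumed form of the quadratic in Lemma 6.3)."""
    return sum(aj * xj * xj for aj, xj in zip(a, x))


def lemma_6_3_pair(a, m, level):
    """Return the pair (x_bar, y_bar) stated in Lemma 6.3.

    a     : eigenvalues in decreasing order, the last m of them negative
    level : the level l < 0
    """
    n = len(a)
    k = n - m  # 0-based index of the (n - m + 1)th coordinate
    x_bar = [0.0] * n
    x_bar[k] = math.sqrt(level / a[k])
    y_bar = [-xj for xj in x_bar]
    return x_bar, y_bar
```

For instance, with eigenvalues `[1.0, -1.0, -3.0]`, `m = 2` and `level = -1.0`, the pair is $\pm(0, 1, 0)$, both points lie on the level set $f = -1$, and their distance is $2\sqrt{l / a_{n-m+1}} = 2$.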

It should be noted however that the minimizing subspace need not be unique,
even if the values of $a_j$ are distinct. The example below highlights how Algorithm
6.2 can fail.



Example 6.4. (Failure of Algorithm 6.2) Suppose $f : \mathbb{R}^3 \to \mathbb{R}$ is defined by
$f(x) = x_1^2 - x_2^2 - 3x_3^2$. The subspace $S := \{x \mid x_1 = x_3\}$ intersects the level set
$\mathrm{lev}_{\ge -1} f$ in the disc

\[ \left\{ \lambda \left( \tfrac{1}{\sqrt{2}} \sin\theta, \cos\theta, \tfrac{1}{\sqrt{2}} \sin\theta \right) \;\middle|\; 0 \le \theta < 2\pi, \; 0 \le \lambda \le 1 \right\}. \]

The largest distance between two points on the disc is 2, and the subspace $S$ can
be verified to give the optimal value to the min-max problem (6.1) by Lemma 6.3.
On the line $S^\perp = \{\lambda(1, 0, -1) \mid \lambda \in \mathbb{R}\}$, the function $f$ is concave, hence there is
no minimum. This example illustrates that Algorithm 6.2 can fail in general. See
Figure 6.2.
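The claims in this example can be checked numerically. The sketch below assumes the quadratic is $f(x) = x_1^2 - x_2^2 - 3x_3^2$, which is the diagonal quadratic consistent with the stated unit disc on $\{x_1 = x_3\}$ and with concavity along $(1, 0, -1)$; all function names are hypothetical.

```python
import math


def f(x):
    # Assumed form of the quadratic in Example 6.4: eigenvalues 1, -1, -3.
    return x[0] ** 2 - x[1] ** 2 - 3.0 * x[2] ** 2


def point_on_S(lam, theta):
    """Point lam * (sin(theta)/sqrt(2), cos(theta), sin(theta)/sqrt(2)) of S = {x : x_1 = x_3}."""
    s = math.sin(theta) / math.sqrt(2.0)
    return (lam * s, lam * math.cos(theta), lam * s)


def point_on_S_perp(lam):
    """Point lam * (1, 0, -1) on the line orthogonal to S."""
    return (lam, 0.0, -lam)


def disc_boundary_values(samples=360):
    """Values of f on the boundary circle (lam = 1) of the disc in S."""
    return [f(point_on_S(1.0, 2.0 * math.pi * k / samples)) for k in range(samples)]


def values_on_S_perp(lams):
    """Values of f along the line S_perp; these equal -2 * lam**2."""
    return [f(point_on_S_perp(lam)) for lam in lams]
```

On $S$ the function reduces to $-\lambda^2$, so the level set $f \ge -1$ is the unit disc of diameter 2. Along $S^\perp$ it reduces to $-2\lambda^2$, which is unbounded below, so step 3 of Algorithm 6.2 has no finite lower bound to return.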



Example 6.4 shows that even if there are only 2 negative eigenvalues, it might
be possible to find a two-dimensional subspace $S$ such that the Hessian is negative
definite on both $S$ and $S^\perp$. Therefore, we amend Algorithm 6.2 by determining the
eigenspace corresponding to the $m$ smallest eigenvalues.

Algorithm 6.5. Fast local method to find saddle points of Morse index $m$.

(1) Set the iteration count $i$ to 0, and let $l_i$ be a lower bound of the critical
value.







Figure 6.2. An example where Algorithm 6.2 fails.



(2) Find $x_i$ and $y_i$, where $(S_i', x_i, y_i)$ is an optimizing triple of

\[ \min_{S \in \mathcal{S}} \; \max_{x, y \in S \cap (\mathrm{lev}_{\ge l_i} f) \cap U_i} |x - y|, \tag{6.3} \]

where $U_i$ is an open set. Here $\mathcal{S}$ is the set of $m$-dimensional affine subspaces
of $\mathbb{R}^n$ intersecting $U_i$. We emphasize that the space where minimality is
attained in the outer minimization problem is $S_i'$. After solving the above
problem, find the subspace $S_i$ that approximates the eigenspace corresponding
to the $m$ smallest eigenvalues using Algorithm 6.6 below.

(3) For an optimizing triple $(S_i', x_i, y_i)$ found in step 2, let $l_{i+1}$ be the lower
bound of $f$ on the $(n - m)$-dimensional affine space passing through $z_i :=
\frac{1}{2}(x_i + y_i)$ whose lineality space is orthogonal to the lineality space of $S_i$.

(4) Increase $i$ and go back to step 2 till convergence.

A local algorithm is needed in Algorithm 6.5 to find the subspace $S_i$ in step 2.

Algorithm 6.6. Finding the subspace $S_i$ in step 2 of Algorithm 6.5.

(1) Let $X_1 = \mathrm{span}\{x_i - y_i\}$, where $x_i$, $y_i$ are found in step 2 of Algorithm 6.5,
and let $j$ be 1.

(2) Find the closest point from $z_i := \frac{1}{2}(x_i + y_i)$ to $\mathrm{lev}_{\le l_i} f \cap (z_i + X_j^\perp)$, which
we call $p_{j+1}$.

(3) Let $X_{j+1} = \mathrm{span}\{X_j, p_{j+1} - z_i\}$ and increase $j$ by 1. If $j = m$, let $S_i$ be
$z_i + X_m$ and the algorithm ends. Otherwise, go back to step 2.

Step 2 of Algorithm 6.6 finds the negative eigenvalues and eigenvectors, starting
from the eigenvalues furthest from zero. Once all the eigenvectors are found, then
$S_i$ is the span of these eigenvectors.
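The order in which Algorithm 6.6 picks up eigenvectors can be seen on an exact diagonal quadratic. The sketch below assumes $f(x) = \sum_j a_j x_j^2$ and $z_i = 0$, and it replaces the closest-point computation of step 2 by its closed form for this special case (the closest point lies on the remaining coordinate axis with the most negative eigenvalue). The function name is hypothetical and this is not a general implementation.

```python
import math


def algorithm_6_6_diagonal(a, m, level):
    """Sketch of Algorithm 6.6 for f(x) = sum_j a[j] * x[j]**2 with z_i = 0.

    a     : eigenvalues in decreasing order, the last m of them negative
    level : the level l < 0
    Returns the 0-based coordinate indices spanning X_m, in the order found,
    together with the points p_2, ..., p_m produced by step 2.
    """
    n = len(a)
    found = [n - m]  # X_1 = span{x_i - y_i} = span{e_{n-m+1}} by Lemma 6.3
    points = []
    while len(found) < m:
        # Step 2 (closed form for a diagonal quadratic): among the coordinates
        # orthogonal to X_j, the nearest point of lev_{<= l} f sits on the axis
        # whose eigenvalue is the most negative.
        k = min((j for j in range(n) if j not in found), key=lambda j: a[j])
        p = [0.0] * n
        p[k] = math.sqrt(level / a[k])  # f(p) = level with |p| as small as possible
        points.append(p)
        found.append(k)  # Step 3: X_{j+1} = span{X_j, p_{j+1} - z_i}
    return found, points
```

With `a = [2.0, 1.0, -1.0, -2.0, -5.0]` and `m = 3`, the indices are found in the order $e_3, e_5, e_4$: first the direction from Lemma 6.3, then the eigenvalues furthest from zero.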

In some situations, the lineality space of $S_i$ is known in advance, or does not
differ too much from the previous iteration. In this case, we can get around using
Algorithm 6.6 and use the estimate instead.

We are now ready to prove the convergence of Algorithm 6.5.

7. Proof of superlinear convergence of local algorithm

We prove our result on the convergence of Algorithm 6.5 in steps. The first step
is to look closely at a model problem.

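Assumption 7.1. Given $\delta > 0$, suppose $h : \mathbb{R}^n \to \mathbb{R}$ is $C^2$,

\[ |\nabla h(x) - Ax| \le \delta |x|, \]

\[ \text{and} \quad |h(x) - \tfrac{1}{2} x^T A x| \le \tfrac{1}{2} \delta |x|^2 \quad \text{for all } x \in \mathbb{B}, \]

where $A \in \mathbb{R}^{n \times n}$ is an invertible diagonal matrix with diagonal entries ordered
decreasingly, of which $a_i = A_{ii}$ and

\[ a_1 > a_2 > \cdots > a_{n-m} > 0 > a_{n-m+1} > \cdots > a_n. \]

Define $h_{\min} : \mathbb{R}^n \to \mathbb{R}$ and $h_{\max} : \mathbb{R}^n \to \mathbb{R}$ by:

\[ h_{\min}(x) := \tfrac{1}{2} x^T (A - \delta I) x, \qquad h_{\max}(x) := \tfrac{1}{2} x^T (A + \delta I) x. \]

It is clear that $\nabla h(0) = 0$, $h(0) = 0$, $\nabla^2 h(0) = A$, and the Morse index is $m$.
Here is a simple observation that bounds the level sets of $h$: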

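Proposition 7.2. (Level set property) The level sets of $h$ satisfy

\[ \mathbb{B} \cap \mathrm{lev}_{\ge l} h_{\min} \subset \mathbb{B} \cap \mathrm{lev}_{\ge l} h \subset \mathbb{B} \cap \mathrm{lev}_{\ge l} h_{\max}, \]
\[ \text{and} \quad \mathbb{B} \cap \mathrm{lev}_{\le l} h_{\max} \subset \mathbb{B} \cap \mathrm{lev}_{\le l} h \subset \mathbb{B} \cap \mathrm{lev}_{\le l} h_{\min}. \]

Proof. This follows easily from $|h(x) - \frac{1}{2} x^T A x| \le \frac{1}{2} \delta |x|^2$ for all $x \in \mathbb{B}$. □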

For convenience, we highlight the standard problem below: 

Problem 7.3. Suppose $g : \mathbb{R}^n \to \mathbb{R}$ is $C^2$, with critical point $0$ of Morse index
$m$, $g(0) = 0$ and the Hessian $\nabla^2 g(0)$ has distinct eigenvalues that are all nonzero.
Consider the problem

\[ \min_{S \in \mathcal{S}} \; \max_{x, y \in S \cap (\mathrm{lev}_{\ge l} g) \cap \mathbb{B}} |x - y|, \]

where $\mathcal{S}$ is the set of $m$-dimensional affine subspaces.



Note that in Problem 7.3, we have limited the region where $x$ and $y$ lie in by $\mathbb{B}$.
Here is a result on the optimizing pair $(\bar{x}, \bar{y})$ of the inner maximization problem in
Problem 7.3.



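Lemma 7.4. (Convergence to eigenvector and saddle point) For all $\delta > 0$ sufficiently
small, suppose that $h : \mathbb{R}^n \to \mathbb{R}$ is such that Assumption 7.1 holds. Assume
that for the optimizing triple $(\bar{S}, \bar{x}, \bar{y})$ of Problem 7.3 for $g = h$, $(\bar{x}, \bar{y})$ is the unique
pair of points in $\bar{S} \cap (\mathrm{lev}_{\ge l} h) \cap \mathbb{B}$ such that $|\bar{x} - \bar{y}| = \mathrm{diam}(\bar{S} \cap (\mathrm{lev}_{\ge l} h) \cap \mathbb{B})$. Then
there exists $\epsilon > 0$ such that if $l < 0$ satisfies $-\epsilon < l < 0$, then $(\bar{x}, \bar{y})$ are such that
$\frac{1}{|\bar{y} - \bar{x}|}(\bar{y} - \bar{x})$ converges to the $(n - m + 1)$th eigenvector as $\delta \to 0$, and
$|\frac{1}{2}(\bar{x} + \bar{y})| \to 0$ as $\delta \to 0$.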

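Proof. Since $\mathrm{lev}_{\le l} h_{\max} \cap \mathbb{B} \subset \mathrm{lev}_{\le l} h \cap \mathbb{B}$, we look at the optimal solution of
the min-max problem for $h_{\max}$ first. The objective of Problem 7.3 for $g = h_{\max}$ is
$2\sqrt{\frac{l}{a_{n-m+1} + \delta}}$ by Lemma 6.3. This gives an upper bound for the min-max problem
for $h$. Similarly, by considering $h_{\min}$ instead, we deduce that Problem 7.3 has
optimal solution bounded from below by $2\sqrt{\frac{l}{a_{n-m+1} - \delta}}$.

Recall the optimality condition in Proposition 4.5. We now proceed to find the
first pair of points with opposite pointing gradients. Let $\bar{x}$ and $\bar{y}$ be optimal points
at level $l$. Now,

\[ \bar{y} - \bar{x} = \lambda_1 \nabla h(\bar{x}), \]
\[ \text{and} \quad \bar{x} - \bar{y} = \lambda_2 \nabla h(\bar{y}), \]

for some $\lambda_1, \lambda_2 > 0$. Then

\begin{align*}
\lambda_1 \nabla h(\bar{x}) + \lambda_2 \nabla h(\bar{y}) &= 0 \\
|\lambda_1 A \bar{x} + \lambda_2 A \bar{y}| &\le \lambda_1 |A\bar{x} - \nabla h(\bar{x})| + \lambda_2 |A\bar{y} - \nabla h(\bar{y})| \\
&\le \delta (\lambda_1 |\bar{x}| + \lambda_2 |\bar{y}|) \\
|A(\lambda_1 \bar{x} + \lambda_2 \bar{y})| &\le \delta (\lambda_1 |\bar{x}| + \lambda_2 |\bar{y}|) \\
\Rightarrow \; |\lambda_1 \bar{x} + \lambda_2 \bar{y}| &\le |A^{-1}| \, |A(\lambda_1 \bar{x} + \lambda_2 \bar{y})| \\
&\le |A^{-1}| \delta (\lambda_1 |\bar{x}| + \lambda_2 |\bar{y}|).
\end{align*}

This means that there are points $x'$ and $y'$ such that $\lambda_1 x' + \lambda_2 y' = 0$, $|\bar{x} - x'| \le
|A^{-1}| \delta |\bar{x}|$ and $|\bar{y} - y'| \le |A^{-1}| \delta |\bar{y}|$. With this, we now concentrate on pairs of points
that are negative multiples of each other.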
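Now,

\[ \left| A\bar{x} - \frac{1}{\lambda_1}(\bar{y} - \bar{x}) \right| = |A\bar{x} - \nabla h(\bar{x})| \le \delta |\bar{x}|. \]

Similarly, this gives us

\[ \left| A\bar{y} + \frac{1}{\lambda_2}(\bar{y} - \bar{x}) \right| \le \delta |\bar{y}|. \]

Therefore,

\[ \left| A(\bar{y} - \bar{x}) + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right)(\bar{y} - \bar{x}) \right| \le \delta (|\bar{x}| + |\bar{y}|). \]

This gives:

\begin{align*}
& \left| A(y' - x') + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right)(y' - x') \right| \\
&\quad \le |A(y' - x') - A(\bar{y} - \bar{x})| + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right) |(y' - x') - (\bar{y} - \bar{x})| \\
&\qquad + \left| A(\bar{y} - \bar{x}) + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right)(\bar{y} - \bar{x}) \right| \\
&\quad \le |A| |A^{-1}| \delta (|\bar{x}| + |\bar{y}|) + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right) |A^{-1}| \delta (|\bar{x}| + |\bar{y}|) + \delta (|\bar{x}| + |\bar{y}|) \\
&\quad = \underbrace{\delta (|\bar{x}| + |\bar{y}|) \left( |A| |A^{-1}| + \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right) |A^{-1}| + 1 \right)}_{(1)}. \tag{7.1}
\end{align*}

Next, we relate $|\bar{y} - \bar{x}|$ and $|\bar{x}| + |\bar{y}|$. We have

\begin{align*}
|\bar{x}| &\le |x'| + |A^{-1}| \delta |\bar{x}| \\
\Rightarrow \; (1 - |A^{-1}| \delta) |\bar{x}| &\le |x'|,
\end{align*}

and similarly, $(1 - |A^{-1}| \delta) |\bar{y}| \le |y'|$. It is clear that $|y' - x'| = |x'| + |y'|$, and we
get

\begin{align*}
|\bar{x}| + |\bar{y}| &\le \frac{1}{1 - |A^{-1}| \delta} (|x'| + |y'|) \\
&= \frac{1}{1 - |A^{-1}| \delta} |y' - x'| \\
&\le \frac{1}{1 - |A^{-1}| \delta} \left( |\bar{y} - \bar{x}| + |A^{-1}| \delta (|\bar{x}| + |\bar{y}|) \right) \\
\Rightarrow \; (1 - 2|A^{-1}| \delta)(|\bar{x}| + |\bar{y}|) &\le |\bar{y} - \bar{x}| \\
\Rightarrow \; |\bar{x}| + |\bar{y}| &\le \frac{1}{1 - 2|A^{-1}| \delta} |\bar{y} - \bar{x}|.
\end{align*}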

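To show that (1) in (7.1) converges to 0 as $\delta \searrow 0$, we need to show that $\left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right)$
remains bounded as $\delta \searrow 0$. Note that

\begin{align*}
\nabla h(\bar{x}) - \nabla h(\bar{y}) &= \left(\frac{1}{\lambda_1} + \frac{1}{\lambda_2}\right)(\bar{y} - \bar{x}) \\
\Rightarrow \; \frac{1}{\lambda_1} + \frac{1}{\lambda_2} &= \frac{1}{|\bar{y} - \bar{x}|} |\nabla h(\bar{x}) - \nabla h(\bar{y})| \\
&\le \frac{1}{|\bar{y} - \bar{x}|} \left[ |A(\bar{x} - \bar{y})| + \delta (|\bar{x}| + |\bar{y}|) \right] \\
&\le \frac{1}{|\bar{y} - \bar{x}|} \left[ |A| |\bar{x} - \bar{y}| + \frac{\delta}{1 - 2|A^{-1}| \delta} |\bar{y} - \bar{x}| \right] \\
&= |A| + \frac{\delta}{1 - 2|A^{-1}| \delta}.
\end{align*}

Since the eigenvectors depend continuously on the entries of a matrix when the
eigenvalues remain distinct, we see that $\frac{1}{|y' - x'|}(y' - x')$ converges to an eigenvector
of $A$ as $\delta \to 0$ from formula (7.1).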

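Next, we show that $\frac{1}{|\bar{y} - \bar{x}|}(\bar{y} - \bar{x})$ converges to an eigenvector corresponding to the
eigenvalue $a_{n-m+1}$. Recall that $2\sqrt{\frac{l}{a_{n-m+1} - \delta}} \le |\bar{x} - \bar{y}|$ and $|(\bar{x} - \bar{y}) - (x' - y')| \le
2|A^{-1}| \delta$. So $\frac{1}{|\bar{y} - \bar{x}|}(\bar{y} - \bar{x})$ has the same limit as $\frac{1}{|y' - x'|}(y' - x')$. If $x'$ and $y'$ are such
that $\frac{1}{|y' - x'|}(y' - x')$ converges to an eigenvector corresponding to $a_k$, then Lemma 7.5
below gives us the following chain of inequalities:

\begin{align*}
|\bar{x} - \bar{y}| &\le |\bar{x}| + |\bar{y}| \\
&\le 2 \sqrt{(1 + \theta)^2 + (n - 1)\theta^2} \sqrt{\frac{l}{(a_k + \delta)(1 - \theta)^2 + (n - 1)(a_1 + \delta)\theta^2}},
\end{align*}

where $\theta \to 0$ as $\delta \searrow 0$. We note that $k \ge n - m + 1$ because $a_k$ cannot be
nonnegative. As $\delta \searrow 0$, the limit of the RHS of the above is $2\sqrt{\frac{l}{a_k}}$. This gives a
contradiction if $k > n - m + 1$, so $k = n - m + 1$.

To show $\frac{1}{2}(\bar{x} + \bar{y}) \to 0$ as $\delta \searrow 0$: We now work out an upper bound for
$|\frac{1}{2}(\bar{x} + \bar{y})|$ using Lemma 7.5. We get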
1 



< 



< 



< 



1 



A- 



m + \y\ 



^'W\-\y'\\ + l\A-'\S{\x\ + \y\) 



\S:\-\y\\ + \A-'\6{\x\ + \y\) 
^[(1 + 0)2 + (ti- 1)021 

1. „, I 



I 



(a„-m+i + <5)(1 - 0)2 + (n - l)(ai + 6)6^ 

A-'\Si\x\ + \y\). 



I 



2' (a„_,„+i - ,5)(1 + 0)2 + (n-l)(a„- ,5)02 

Here, > is such that — > as (5 0. At this point, we note that the final formula 
above can be written as ^ ~ \/^) ^ l^^^l'^d^l ~^ I^^D' '^^ere ci, C2 < 0, with 

|ci| < |c2|, and ci,C2 — > fln-m+i as (5 — > 0. Therefore 

1 



< 



I 

J 

Cl C2 



A-i|5(|5;| + |y|) 



|^-i|,5(|5;| + |y|) 



< 



< 



1 1(02 -Cl) 



2ciC2 



C2 - Cl /T 



\A-'\6{\x\ + \y\). 



It is clear that as $\delta \to 0$, the above formula goes to zero, so $|\frac{1}{2}(\bar{x} + \bar{y})| \to 0$ as
$\delta \to 0$ as needed. □



In Lemma 7.5 below, we say that $e_i$ is the $i$th elementary vector if it is the
$i$th column of the identity matrix. It is also the eigenvector corresponding to the
eigenvalue $a_i$ of $A$.



Lemma 7.5. (Length estimates of vectors) Let $h : \mathbb{R}^n \to \mathbb{R}$ and $A \in \mathbb{R}^{n \times n}$ satisfy
Assumption 7.1 for some $\delta > 0$. Suppose $h(\bar{x}) = h(\bar{y}) = l < 0$. Suppose $\theta > 0$ is
such that $|d_{\bar{x}} - e_k|_\infty < \theta$ for some $d_{\bar{x}}$ pointing in the same direction as $\bar{x}$, and that
the same relation holds for $\bar{y}$.

Then $|\bar{x}|$ and $|\bar{y}|$ are bounded from below and above by

\[ (1 - \theta) \sqrt{\frac{l}{(a_k - \delta)(1 + \theta)^2 + (n - 1)(a_n - \delta)\theta^2}} \le |\bar{x}|, |\bar{y}| \le \sqrt{(1 + \theta)^2 + (n - 1)\theta^2} \sqrt{\frac{l}{(a_k + \delta)(1 - \theta)^2 + (n - 1)(a_1 + \delta)\theta^2}}. \]

Proof. Necessarily we must have $k \ge n - m + 1$ because if $k < n - m + 1$, then
the eigenvalue $a_k$ is positive, making the direction $e_k$ and its nearby directions
directions of ascent. Let $d := d_{\bar{x}}$. From $|d - e_k|_\infty < \theta$, we obtain $1 - \theta < d_k < 1 + \theta$
and $|d_j| < \theta$ for $j \ne k$.

For the direction $d$, we find the largest and smallest value for the norm of $x$ if
$\operatorname{unit}(x) = \operatorname{unit}(d)$. This is also the largest and smallest possible value of $t$ such that
$h(t \operatorname{unit}(d)) = l$. Here, $\operatorname{unit}(\cdot)$ maps a nonzero vector to the unit vector of the same
direction. Now



h{tumt{d)) < ^{ai - S)[t unit{d)i] 

4=1 

dl_ 



- J^[^[i^k-5){l + 0f + {n-l){ar,-6)6% 



Since h{t \mit{d)) ~ I, we have 
Next, 



I 



iak-5)il + ey + {n-l)ian-S)e^ 



h{t unit((i)) > ^(fli + S)[t umt{d)i 



i=l 



2^7 = 1 



J 4 = 1 

Again, since h{t unit((i)) = Z, we have 

^ ^ + + - 1)^^] . IV ^AW 

□ 

Here are some lemmas on the completion of orthogonal matrices. In Lemmas
7.6 and 7.7, let $|\cdot|$ denote a norm for matrices (which need not be a matrix norm).

Lemma 7.6. (Completion to orthogonal matrix) Let $E_k \in \mathbb{R}^{n \times k}$ be the first $k$
columns of the $n \times n$ identity matrix. Then for all $\epsilon > 0$, there exists $\delta > 0$ such
that if $V_k \in \mathbb{R}^{n \times k}$ has orthonormal columns and $|V_k - E_k| < \delta$, then $V_k$ can be
completed to an orthogonal matrix $V_n \in \mathbb{R}^{n \times n}$ such that $|I - V_n| < \epsilon$.



The above lemma is an easy consequence of the following result.

Lemma 7.7. (Finding orthogonal vector) For all $\epsilon > 0$, there exists a $\delta > 0$ such
that if $V_k \in \mathbb{R}^{n \times k}$ has orthonormal columns and $|V_k - E_k| < \delta$, then there is a
vector $v_{k+1} \in \mathbb{R}^n$ such that $|v_{k+1}|_2 = 1$ and is orthogonal to all columns of $V_k$, and
the concatenation $V_{k+1} := [V_k, v_{k+1}]$ satisfies $|V_{k+1} - E_{k+1}| < \epsilon$.

Proof. Since all finite dimensional norms are equivalent, we can assume that the
norm $|\cdot|$ on $\mathbb{R}^{n \times k}$, $\mathbb{R}^{n \times (k+1)}$ is the $\infty$-norm for vectors, that is $|M| = \max_{i,j} |M(i,j)|$.
Suppose $|V_k - E_k| < \delta$. Then $|V_k(i,j)| < \delta$ if $i \ne j$ and $|V_k(i,i) - 1| < \delta$. We now
construct the vector $v_{k+1}$ using the Gram-Schmidt process.

The direction of $v_{k+1}$ obtained by the Gram-Schmidt process is:

\begin{align*}
(I - V_k V_k^T) e_{k+1} &= e_{k+1} - V_k V_k^T e_{k+1} \\
&= e_{k+1} - \sum_{i=1}^{k} V_k(k+1, i) V_k(:, i).
\end{align*}

Since $|V_k(i,j)| < \delta$ for all $i \ne j$, the sum $\alpha_{k+1} \in \mathbb{R}^n$ defined by $\alpha_{k+1} = -\sum_{i=1}^{k} V_k(k+1, i)
V_k(:, i)$ has components obeying the bounds

\[ |\alpha_{k+1}(j)| \le \begin{cases} k\delta^2 & \text{if } j \ge k + 1, \\ (k - 1)\delta^2 + \delta & \text{if } j < k + 1. \end{cases} \]

Then $v_{k+1} = \operatorname{unit}(e_{k+1} + \alpha_{k+1})$. We first analyze the maximum error in $v_{k+1}(j)$
for $j \ne k + 1$. The 2-norm of $e_{k+1} + \alpha_{k+1}$ is at least

|efc+i+afc+i|2 > |efe+i|2 - |q:/c+i|2 



> 



Y,[ak+iii)? 



> 1 - ^/{n - k)kd^ +k{{k- 1)(52 + d) 

> 1 - \/nkS^ + k5. 

If (5 < min{l, 4(„^i)fc }; then ^nkS'^ + kS < i, and so \ek+i + afe+ijj > 5. In this 
case, the maximum error in Vk+i{j) is 2{{k — 1)5"^ + 5) for j ^ k. For j = fc -|- 1, we 
note that |wfe+i|2 = 1, which tells us that 

[vk+iik + 1)]^ = 1- Yl K+iWI' 

l<i<.n.i^k 

> l-4(n-l)[(A:-l)(52 + ,5]2 
^Vk+iik + 1) > Vl-4(n-l)[(fc-l)<52 + 5]2 
= ^/l-'iin-l)5^[{k- 1)6 + 1]^. 
U5 < min{l, ^^^^}, then 4(n - l),52[(fc - l)S + 1]^ < 1, which gives 

Vk+i{k + l) > Vl-4(n-l)(52[(fc-l)5+l]2 
> l-4(n-l)(52[(A;-l)(5+l]2 
\vk+i{k + l)-l\ < 4(n-l)^2[(A;-l)5 + l]2. 

If 5 < min{^, ik^f^}, then 

\begin{align*}
|[V_k, v_{k+1}] - E_{k+1}| &\le \max\{2[(k - 1)\delta^2 + \delta], \; 4(n - 1)\delta^2[(k - 1)\delta + 1]^2\} \\
&< \epsilon,
\end{align*}

and we are done. □

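The Gram-Schmidt step in the proof of Lemma 7.7, and the repeated completion it yields for Lemma 7.6, can be sketched as follows. Matrices are stored as lists of columns; the function names are hypothetical and the sketch assumes the input columns are orthonormal and close to $e_1, \dots, e_k$, so that the projected vector is nonzero.

```python
import math


def next_orthonormal_column(V):
    """Given k orthonormal columns V (a list of length-n lists) close to e_1, ..., e_k,
    return v_{k+1} = unit((I - V V^T) e_{k+1})."""
    k, n = len(V), len(V[0])
    u = [1.0 if j == k else 0.0 for j in range(n)]  # e_{k+1} (0-based index k)
    for col in V:
        coeff = col[k]  # inner product <col, e_{k+1}>, i.e. the entry V(k+1, i)
        u = [uj - coeff * cj for uj, cj in zip(u, col)]
    norm = math.sqrt(sum(uj * uj for uj in u))
    return [uj / norm for uj in u]


def complete_to_orthogonal(V):
    """Append columns one at a time until V is a square orthogonal matrix."""
    V = [list(col) for col in V]
    while len(V) < len(V[0]):
        V.append(next_orthonormal_column(V))
    return V
```

Starting from the single column $(\cos t, \sin t, 0)$ with small $t$, the completion stays within roughly $\sin t$ of the identity in the entrywise maximum norm, which is the behaviour the lemmas quantify.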
The next result shows that we get a closer estimate to the critical value after
each iteration of step 2 in Algorithm 6.5.

Lemma 7.8. (Lower bound on critical value) Let $\delta > 0$ be sufficiently small, and
suppose $h : \mathbb{R}^n \to \mathbb{R}$ satisfies Assumption 7.1. Let $S_{z,V} := \{z + Vw \mid w \in \mathbb{R}^{n-m}\}$,
and $V \in \mathbb{R}^{n \times (n-m)}$ such that $V$ has orthonormal columns, with $|z - 0| < \delta$ and
$|V - E_{n-m}| < \delta$, where $E_{n-m} \in \mathbb{R}^{n \times (n-m)}$ is the first $n - m$ columns of the identity
matrix. Then

\[ -\tfrac{1}{2} |A - \delta I| \left(1 + \left|[V^T (A - \delta I) V]^{-1}\right| |V^T| |A - \delta I|\right)^2 |z|^2 \le \min_{s \in S_{z,V}} h(s). \]

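Proof. We find a lower bound for the smallest value of $h$ on $S_{z,V}$. The function
$h_{z,V} : \mathbb{R}^{n-m} \to \mathbb{R}$ defined by $h_{z,V}(w) := h(z + Vw)$ satisfies

\begin{align*}
h_{z,V}(w) &= h(z + Vw) \\
&\ge \tfrac{1}{2} (z + Vw)^T (A - \delta I)(z + Vw).
\end{align*}

Let us denote $h_{z,V,\min} : \mathbb{R}^{n-m} \to \mathbb{R}$ by $h_{z,V,\min}(w) = \frac{1}{2}(z + Vw)^T (A - \delta I)(z + Vw)$.
The Hessian of $h_{z,V,\min}$ is $V^T (A - \delta I) V$, which tells us that $h_{z,V,\min}$ is strictly
convex. Therefore, we seek to find the minimizer of $h_{z,V,\min}$.

The minimizing value of $w$, which we denote as $w_{\min}$, satisfies $\nabla h_{z,V,\min}(w_{\min}) =
0$. This gives us

\begin{align*}
V^T (A - \delta I) z + V^T (A - \delta I) V w_{\min} &= 0 \\
\Rightarrow \; w_{\min} &= -[V^T (A - \delta I) V]^{-1} V^T (A - \delta I) z.
\end{align*}

An easy bound on $|w_{\min}|$ is $|w_{\min}| \le \left|[V^T (A - \delta I) V]^{-1}\right| |V^T| |A - \delta I| |z|$. So $h_{z,V,\min}$
is bounded from below by

\begin{align*}
\min_w h_{z,V,\min}(w) &= \min_w \tfrac{1}{2} (z + Vw)^T (A - \delta I)(z + Vw) \\
&= \tfrac{1}{2} (z + Vw_{\min})^T (A - \delta I)(z + Vw_{\min}) \\
&\ge -\tfrac{1}{2} |z + Vw_{\min}| \, |A - \delta I| \, |z + Vw_{\min}| \\
&= -\tfrac{1}{2} |A - \delta I| \, |z + Vw_{\min}|^2 \\
&\ge -\tfrac{1}{2} |A - \delta I| \, [|z| + |Vw_{\min}|]^2 \\
&= -\tfrac{1}{2} |A - \delta I| \, [|z| + |w_{\min}|]^2 \\
&\ge -\tfrac{1}{2} |A - \delta I| \left(|z| + \left|[V^T (A - \delta I) V]^{-1}\right| |V^T| |A - \delta I| |z|\right)^2 \\
&= -\tfrac{1}{2} |A - \delta I| \left(1 + \left|[V^T (A - \delta I) V]^{-1}\right| |V^T| |A - \delta I|\right)^2 |z|^2. \tag{7.2}
\end{align*}

□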

We shall prove Lemma 7.9 about the approximation of the eigenvectors corresponding
to the smallest eigenvalues. This lemma analyzes Algorithm 6.6. We
clarify our notation. In the case of an exact quadratic, Algorithm 6.5 first finds
$e_{n-m+1}$, then invokes Algorithm 6.6 to find the eigenvector $e_n$, followed by $e_{n-1}$,
$e_{n-2}$ and so on, all the way to $e_{n-m+2}$.

We define $I_k$ and $I_k^\perp$ as subsets of $\{1, \dots, n\}$ by

\begin{align*}
I_k &:= \{n - m + 1\} \cup \{n - k + 2, n - k + 3, \dots, n\} \\
I_k^\perp &:= \{1, \dots, n\} \setminus I_k.
\end{align*}

Next, we define $E_k'$ and $E_k^\perp$. The matrix $E_k' \in \mathbb{R}^{n \times k}$ has the $k$ columns $e_{n-m+1}$, $e_n$,
$e_{n-1}$, ..., $e_{n-k+2}$, while the matrix $E_k^\perp \in \mathbb{R}^{n \times (n-k)}$ contains all the other columns
in the $n \times n$ identity matrix. The columns of $E_k'$ and $E_k^\perp$ are chosen from the $n \times n$
identity matrix from the index sets $I_k$ and $I_k^\perp$ respectively.

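We will need to analyze the eigenvalues of $(V_k^\perp)^T (A \pm \delta I) V_k^\perp$ in the proof of
Lemma 7.9, where $|V_k^\perp - E_k^\perp|$ is small. Note that the matrix $(E_k^\perp)^T (A \pm \delta I) E_k^\perp$ is a
principal submatrix of $A \pm \delta I$, and its eigenvalues are the eigenvalues of $A \pm \delta I$ chosen
according to the index set $I_k^\perp$. Furthermore,

\begin{align*}
& \left| (V_k^\perp)^T (A \pm \delta I) V_k^\perp - (E_k^\perp)^T (A \pm \delta I) E_k^\perp \right| \\
&\quad \le \left| (V_k^\perp)^T (A \pm \delta I) V_k^\perp - (V_k^\perp)^T (A \pm \delta I) E_k^\perp \right| \\
&\qquad + \left| (V_k^\perp)^T (A \pm \delta I) E_k^\perp - (E_k^\perp)^T (A \pm \delta I) E_k^\perp \right| \\
&\quad \le \left| (V_k^\perp)^T (A \pm \delta I) \right| \left| V_k^\perp - E_k^\perp \right| \\
&\qquad + \left| (V_k^\perp)^T - (E_k^\perp)^T \right| \left| (A \pm \delta I) E_k^\perp \right|.
\end{align*}

It is clear that as $|V_k^\perp - E_k^\perp| \to 0$, $\left| (V_k^\perp)^T (A \pm \delta I) V_k^\perp - (E_k^\perp)^T (A \pm \delta I) E_k^\perp \right| \to 0$.
The eigenvalues of a matrix vary continuously with respect to the entries when the
eigenvalues are distinct, so we shall let $\hat{a}_i$ denote the eigenvalue of $(V_k^\perp)^T (A + \delta I) V_k^\perp$
that is closest to $a_i$, and $\tilde{a}_i$ denote the eigenvalue of $(V_k^\perp)^T (A - \delta I) V_k^\perp$ that is closest
to $a_i$.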

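Lemma 7.9. (Estimates of eigenvectors to negative eigenvalues) Let $h : \mathbb{R}^n \to \mathbb{R}$.
Given a fixed $l < 0$ sufficiently close to 0, let $p$ be the closest point to $z$ in the set
$\mathrm{lev}_{\le l} h \cap S_{z,V_k^\perp}$, where $S_{z,V_k^\perp} := \{z + V_k^\perp w \mid w \in \mathbb{R}^{n-k}\}$ and $V_k^\perp \in \mathbb{R}^{n \times (n-k)}$. Then
for all $\epsilon > 0$, there exists $\delta > 0$ such that if

(1) $h : \mathbb{R}^n \to \mathbb{R}$ satisfies Assumption 7.1,

(2) $|z| < \delta$, and

(3) $|V_k^\perp - E_k^\perp| < \delta$, where $V_k^\perp$ has orthonormal columns,

then $|\operatorname{unit}(p - z) - e_{n-k+1}| < \epsilon$. As a consequence, $|V_{k+1}^\perp - E_{k+1}^\perp| \to 0$ as $\delta \to 0$.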

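Proof. The first step is to find an upper bound on the distance between $z$ and $p$.
The upper bound is obtained from looking at the closest distance between $z$ and
$\mathrm{lev}_{\le l} h_{\max} \cap S_{z,V_k^\perp}$.

We look at the function $h_{z,V_k^\perp,\max} : \mathbb{R}^{n-k} \to \mathbb{R}$ defined by $h_{z,V_k^\perp,\max}(w) :=
h_{\max}(z + V_k^\perp w)$. We have

\begin{align*}
h_{z,V_k^\perp,\max}(w) &= \tfrac{1}{2} (z + V_k^\perp w)^T (A + \delta I)(z + V_k^\perp w) \\
\Rightarrow \; \nabla h_{z,V_k^\perp,\max}(w) &= (V_k^\perp)^T (A + \delta I) z + (V_k^\perp)^T (A + \delta I) V_k^\perp w.
\end{align*}

The critical point of $h_{z,V_k^\perp,\max}$ is thus $w_{\max} := -[(V_k^\perp)^T (A + \delta I) V_k^\perp]^{-1} (V_k^\perp)^T (A +
\delta I) z$. The critical value corresponding to this is $h_{z,V_k^\perp,\max}(w_{\max})$. An upper bound
for this critical value is $\frac{1}{2} |z + V_k^\perp w_{\max}| \, |A + \delta I| \, |z + V_k^\perp w_{\max}|$, which is in turn bounded
by:

\begin{align*}
& \tfrac{1}{2} |z + V_k^\perp w_{\max}| \, |A + \delta I| \, |z + V_k^\perp w_{\max}| \\
&\quad = \tfrac{1}{2} |A + \delta I| \, |z + V_k^\perp w_{\max}|^2 \\
&\quad \le \tfrac{1}{2} |A + \delta I| \left(1 + \left|[(V_k^\perp)^T (A + \delta I) V_k^\perp]^{-1}\right| |(V_k^\perp)^T| |A + \delta I|\right)^2 |z|^2
\end{align*}

(following calculations similar to that of (7.2)).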



Then an upper bound of the distance (i(z, lev<;/iinax ^ S^y±) can be calculated by: 



rf(z, n lev<;/i,„ax) 

< \z- {z + Vj^'-Winax) \ + d{z + V^W^a^y,, Sg y± H lev<;/i; 



max J 



< \Wr, 




l-^\A + 6I\{l + \ [{V.^ViA + SI)V,^]-^ I |(y,^n \A + 6I\) 



\2 I 



Z 



O.n-k + 1 + S 



The extra term $-\frac{1}{2} |A + \delta I| \cdots |z|^2$ compensates for the fact that the critical
value of $h_{z,V_k^\perp,\max}$ is not necessarily zero. To simplify notation, let $\beta$ be the right
hand side of the above formula as marked.

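We now figure the possible intersection between $\mathbb{B}(z, \beta)$ and $S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h$.
Again, since $S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h \subset S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h_{\min}$, we look at the intersection of
$\mathbb{B}(z, \beta)$ and $S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h_{\min}$. We find the critical point of $h_{z,V_k^\perp,\min} : \mathbb{R}^{n-k} \to \mathbb{R}$
defined by $h_{z,V_k^\perp,\min}(w) := \frac{1}{2}(z + V_k^\perp w)^T (A - \delta I)(z + V_k^\perp w)$. The gradient of
$h_{z,V_k^\perp,\min}$ can be found to be

\[ \nabla h_{z,V_k^\perp,\min}(w) = (V_k^\perp)^T (A - \delta I) z + (V_k^\perp)^T (A - \delta I) V_k^\perp w. \]

Once again, the critical point is $w_{\min} = -[(V_k^\perp)^T (A - \delta I) V_k^\perp]^{-1} (V_k^\perp)^T (A - \delta I) z$. So

\[ \mathbb{B}(z, \beta) \subset \mathbb{B}(z + V_k^\perp w_{\min}, \beta + |w_{\min}|). \]

Consider $z + V_k^\perp \tilde{p} \in \mathbb{B}(z + V_k^\perp w_{\min}, \beta + |w_{\min}|) \cap (S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h_{\min})$. Let us introduce a
change of coordinates such that $\tilde{p} = \sum_{i \in I_k^\perp} \tilde{p}_i v_i + w_{\min}$, where $v_i \in \mathbb{R}^{n-k}$ correspond
to the eigenvectors of $(V_k^\perp)^T (A - \delta I) V_k^\perp$ (in turn corresponding to the eigenvalues
$\tilde{a}_i$) and $\tilde{p}_i \in \mathbb{R}$ are the multipliers. Then the condition $z + V_k^\perp \tilde{p} \in \mathbb{B}(z + V_k^\perp w_{\min}, \beta +
|w_{\min}|)$ and $z + V_k^\perp \tilde{p} \in S_{z,V_k^\perp} \cap \mathrm{lev}_{\le l} h_{\min}$ can be represented as the following
constraints respectively: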


E 



pI < (/3+kmi„|)', 



(7.3) ^ ^ + + Vi^Wn^infiA - 51){z + V^W^,^). 



As (5 — > 0, the only admissible solution is Pn-k+i = y g — and the rest of the 
Pi's are zero. The above constraints are linear in pf. We consider the minimum 
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possible value of = 'tV , which is the dot product between the unit 

vectors in the direction of Vn-k+i and p — (z + Vj^Wmin)- This is equivalent to the 
linear fractional program in fif of minimizing "^pV subject to the constraints in 

This linear fractional program can be transformed into a linear program by 
q = jyi^ and q., = which gives: 

min q„-k+i 

a > 



(7.4) s.t. 

(7.5) 
(7.6) 



1 



(/3 



1 



E^eI^^^(l^ ^ I + + Vf^-W^inf {A - dl){z + V,,^W„,in) 

q, > for all i e li . 



q, 



The constraints of the linear program above gives 



E 



-a-iqi 



E 

-5,, + a„_fc)% 



> 



> 



1 



> - 



l+-iz + Vk^W,^iny {A - 5I){Z + Ffe^lD^in) 



/ + - (Z + V^Wn,i^f{A - 5I){Z + V^W^ir,) 

+ V^w^-,,,,Y{A - 81){z + V^w^-^,,) 



q + a„ 



Since only —an-k+i + fln-fc is positive and the other —hi + a„_fc are nonpositive, 
we have 



{ — O.n-k+1 + 0-n-k)qn-k 



k + 1 



> 



E 



> - 



(-a,j + an-k)qi 
1 + W + Vitw^^ViA - 51){z + V^w^,,,,) 



1n-k+l > 



1 



1+1{Z + V,^W^,ir,V{A - 6I){Z + l/fc^tDniin) 



-flri-fe+l + fin-fe \ (/3 + |Wmin|)^ 

The limit of the right hand side goes to 1 as $\delta \to 0$, so this means that $z - p$ is
close to the direction of the eigenvector corresponding to the eigenvalue $\hat{a}_{n-k+1}$
in $(V_k^\perp)^T A V_k^\perp$, which in turn converges to $e_{n-k+1}$. The proof of the first part of
this lemma is complete.

The conclusion that $|V_{k+1}^\perp - E_{k+1}^\perp| \to 0$ as $\delta \to 0$ follows from the first part of this
lemma and Lemma 7.6. □



With these lemmas set up, we are now ready to prove the fast local convergence
of Algorithm 6.5 to the critical point and critical value. We recall that
Q-linear convergence of a sequence of positive numbers $\{a_i\}_{i=1}^\infty$ converging to zero
is defined by $\limsup_{i \to \infty} \frac{a_{i+1}}{a_i} < 1$, while Q-superlinear convergence is defined by
$\lim_{i \to \infty} \frac{a_{i+1}}{a_i} = 0$. Next, R-linear convergence and R-superlinear convergence of a
sequence are defined by being bounded by a Q-linearly convergent sequence and a
Q-superlinearly convergent sequence respectively.
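The two Q-rates can be told apart on a finite prefix of a sequence by inspecting successive ratios. The following sketch uses two illustrative sequences chosen here for the purpose (they do not come from Algorithm 6.5): a geometric sequence and a doubly exponential one.

```python
def q_ratios(errors):
    """Successive ratios a_{i+1} / a_i of a sequence of positive numbers."""
    return [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]


# Q-linear example: a_i = 0.5**i, ratios stay at 0.5.
linear = [0.5 ** i for i in range(1, 8)]

# Q-superlinear example: a_i = 0.5**(2**i), ratios tend to 0.
superlinear = [0.5 ** (2 ** i) for i in range(0, 6)]
```

A finite list of ratios can only suggest the limiting behaviour, but ratios that settle at a constant below 1 are consistent with Q-linear convergence, while ratios that keep shrinking toward 0 are consistent with Q-superlinear convergence.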



Theorem 7.10. (Fast convergence of Algorithm 6.5) Suppose that $f : \mathbb{R}^n \to \mathbb{R}$ is
$C^2$ and $0$ is a nondegenerate critical point of $f$ of Morse index $m$, and $f(0) = 0$.
There is some $R > 0$ such that if $0 < r < R$, then for $U_i = \mathbb{B}(0, r)$ and $l_0 < 0$
(depending on $r$) sufficiently close to 0, Algorithm 6.5 converges R-superlinearly to
the critical point and Q-superlinearly to the critical value provided that at each
iteration, there exists an optimizing triple $(S_i', x_i, y_i)$ for which $(x_i, y_i)$ is the unique
pair of points in $S_i' \cap (\mathrm{lev}_{\ge l_i} f) \cap \mathbb{B}(0, r)$ such that $|x_i - y_i| = \mathrm{diam}(S_i' \cap (\mathrm{lev}_{\ge l_i} f) \cap
\mathbb{B}(0, r))$.

If a sufficiently good approximate for the lineality space of each $S_i$ is available
in step 2 of Algorithm 6.5 instead, then Algorithm 6.5 converges R-linearly to the
critical point and Q-linearly to the critical value.

Proof. As a reminder, Q-linear convergence of the critical value is defined to be
$\limsup_{i \to \infty} \frac{|l_{i+1}|}{|l_i|} < 1$, and Q-superlinear convergence of the critical value is defined
to be $\lim_{i \to \infty} \frac{|l_{i+1}|}{|l_i|} = 0$. From the Q-linear (Q-superlinear) convergence of $l_i$, we
obtain the R-linear (R-superlinear) convergence of the critical point by observing
that $\limsup_{i \to \infty} \frac{|z_i|}{\sqrt{|l_i|}} < \infty$, and that $\sqrt{|l_i|}$ converges Q-linearly (Q-superlinearly).

Since $f$ is $C^2$, for all $\delta > 0$, we can find $R > 0$ such that

\[ |f(x) - \tfrac{1}{2} x^T A x| \le \tfrac{1}{2} \delta |x|^2 \quad \text{for all } x \in \mathbb{B}(0, R). \]

The function $f_R : \mathbb{R}^n \to \mathbb{R}$ defined by $f_R(x) := \frac{1}{R^2} f(Rx)$ satisfies Assumption 7.1
with $A := \nabla^2 f_R(0) = \nabla^2 f(0)$.

We want to show that if $\delta > 0$ is sufficiently small, then for all $l < 0$ sufficiently
small, a step in Algorithm 6.5 gives good convergence to the critical value. Given
an iterate $l_i$, the next iterate $l_{i+1}$ is

\[ \min_{x \in S_{z_i, V_m^\perp}} f(x) = \min_{x \in S_{z_i, V_m^\perp}} R^2 f_R(x / R), \]

where $V_m^\perp$, which approximates the first $n - m$ eigenvectors, is defined before Lemma
7.9.

We seek to find $\frac{|l_{i+1}|}{|l_i|}$. The value of $l_{i+1}$ depends on how well the last $m$ eigenvectors
are approximated, and how well the critical point is estimated, which in
turn depends on $\delta$. The ratio $\frac{|l_{i+1}|}{|l_i|}$ is bounded from above by

\[ \tfrac{1}{2} |A - \delta I| \left(1 + \left|[(V_m^\perp)^T (A - \delta I) V_m^\perp]^{-1}\right| |(V_m^\perp)^T| |A - \delta I|\right)^2 \frac{|z_i|^2}{|l_i|}, \]

which converges to 0 as $\delta \to 0$ by Lemmas 7.4, 7.6, 7.8 and 7.9.

The conclusion in the second part of the theorem follows a similar analysis. □

8. Conclusion and conjectures 

In this paper, we present a strategy to find saddle points of general Morse index,
extending the algorithms for finding critical points of mountain pass type as was
done in [12]. Algorithms 6.5 and 6.6 may not be easily implementable, especially
when $m$ is large. However, Algorithm 6.6 can be performed only as needed in
a practical implementation. It is hoped that this strategy can augment current
methods for finding saddle points, and can serve as a foundation for further research
on effective methods of finding saddle points.

Here are some conjectures:

• How do the algorithms presented fare in real problems? Are there difficulties
in the infinite dimensional case when implementing Algorithm 6.5?

• Are there ways to integrate Algorithms 6.5 and 6.6 to give a better algorithm?

• Are there better algorithms than Algorithm 6.6 to approximate $S_i$?

• Can the uniqueness assumption in Theorem 7.10 be lifted? If not, how does
it affect the design of algorithms?